dnadna.nets
DNADNA’s neural nets used for model training.
Module Attributes
DNADNANet | Alias for backwards-compatibility; use just dnadna.nets.Network instead.
Classes
CNN | CNN is a basic convolutional neural network.
CustomCNN | CustomCNN is a convolutional neural network that infers demographic parameters from a SNP matrix and its associated vector of positions.
DNADNANet | Alias for backwards-compatibility; use just dnadna.nets.Network instead.
MLP | MLP is a basic fully connected network.
Network | Base class for DNADNA neural nets.
SPIDNA | SPIDNA is a convolutional neural network that infers evolutionary parameters.
SPIDNABlock | Sub-part of the SPIDNA network.
- class dnadna.nets.CNN(n_outputs)[source]
Bases:
dnadna.nets.Network
CNN is a basic convolutional neural network. It can be used as a baseline or for testing dnadna.
Task
Regression / Classification
- Constraints
min_snp (400)
max_snp (400)
min_indiv (50)
max_indiv (50)
Warning
None
- Parameters
n_outputs (int) – number of parameters or classes to infer
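Read literally, the constraints listed above mean that CNN expects inputs with exactly 400 SNPs and 50 individuals, since the minimum and maximum coincide. The following is a minimal sketch of such a check, not dnadna code: the helper `check_dims` and the `CNN_CONSTRAINTS` dict are hypothetical names introduced for illustration, and `None` is assumed to stand for "no constraint".

```python
# Hypothetical helper, not part of dnadna: checks input dimensions against
# the min/max constraints documented for a net (None means "no constraint").
CNN_CONSTRAINTS = {"min_snp": 400, "max_snp": 400, "min_indiv": 50, "max_indiv": 50}


def check_dims(n_snp, n_indiv, constraints):
    """Return a list of human-readable constraint violations (empty if OK)."""
    problems = []
    for name, value in (("snp", n_snp), ("indiv", n_indiv)):
        lo = constraints.get(f"min_{name}")
        hi = constraints.get(f"max_{name}")
        if lo is not None and value < lo:
            problems.append(f"n_{name}={value} is below min_{name}={lo}")
        if hi is not None and value > hi:
            problems.append(f"n_{name}={value} is above max_{name}={hi}")
    return problems


print(check_dims(400, 50, CNN_CONSTRAINTS))  # matches the documented bounds
print(check_dims(500, 50, CNN_CONSTRAINTS))  # too many SNPs for CNN
```

The same dict shape could describe the other nets on this page by setting the unconstrained bounds to `None`.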
- forward(x)[source]
The forward function of the network (see torch.nn.Module for more details). Should accept a batch of SNP matrices as input, which may be in either concat format (where the position array has the SNP matrix concatenated to it) or product format (where the position array is multiplied by the SNP matrix).
If the operation of the net depends on which format the input is in, its __init__ method should accept a concat argument. It will be passed True or False by the network trainer depending on which format the inputs are in.
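The two input formats described above can be pictured with a small sketch. Plain Python lists stand in for tensors so that the example runs without dependencies, and the exact layout dnadna uses (for instance, placing the positions as the first row in concat format) is an assumption, not something this page states.

```python
# Illustrative only: plain lists stand in for tensors, and the exact layout
# dnadna uses (e.g. where the position row goes) is an assumption here.
positions = [0.1, 0.4, 0.9]  # one position per SNP
snp = [                      # 2 individuals x 3 SNPs
    [0, 1, 1],
    [1, 0, 1],
]

# "concat" format: the position array with the SNP matrix concatenated to it,
# giving (n_indiv + 1) rows.
concat_input = [positions] + snp

# "product" format: each individual's row multiplied element-wise by the
# positions, keeping n_indiv rows.
product_input = [[p * allele for p, allele in zip(positions, row)] for row in snp]

print(len(concat_input), len(product_input))
```

Under this assumption the two formats differ in shape by one row, which is why a net whose layers depend on the number of rows needs to know which format it receives.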
- class dnadna.nets.CustomCNN(n_snp, n_indiv, n_outputs, concat=True)[source]
Bases:
dnadna.nets.Network
CustomCNN is a convolutional neural network that infers demographic parameters from a SNP matrix and its associated vector of positions. The number of SNPs is predefined and fixed.
The network is based on multiple 2D convolution filters of mixed sizes.
Task
Regression
- Constraints
min_snp (400)
max_snp (no constraint)
min_indiv (50)
max_indiv (50)
Warning
None
Notes
This net was used to predict population sizes through time. It is called “custom CNN” in Sanchez et al., and was referred to in earlier versions of this code as “SPIDNA1”.
Publication
T. Sanchez, J. Cury, G. Charpiat, and F. Jay, “Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation”, Mol Ecol Resour, p. 1755‑0998.13224, Jul. 2020, doi: 10.1111/1755-0998.13224.
- Parameters
- forward(x)[source]
The forward function of the network (see torch.nn.Module for more details). Should accept a batch of SNP matrices as input, which may be in either concat format (where the position array has the SNP matrix concatenated to it) or product format (where the position array is multiplied by the SNP matrix).
If the operation of the net depends on which format the input is in, its __init__ method should accept a concat argument. It will be passed True or False by the network trainer depending on which format the inputs are in.
- dnadna.nets.DNADNANet
Alias for backwards-compatibility; use just dnadna.nets.Network instead.
alias of dnadna.nets.Network
- class dnadna.nets.MLP(n_snp, n_indiv, n_outputs, concat=True)[source]
Bases:
dnadna.nets.Network
MLP is a basic fully connected network. It can be used as a baseline or for testing dnadna.
Task
Regression / Classification
- Constraints
min_snp (no constraints)
max_snp (no constraints)
min_indiv (no constraints)
max_indiv (no constraints)
Warning
None
- Parameters
- forward(x)[source]
The forward function of the network (see torch.nn.Module for more details). Should accept a batch of SNP matrices as input, which may be in either concat format (where the position array has the SNP matrix concatenated to it) or product format (where the position array is multiplied by the SNP matrix).
If the operation of the net depends on which format the input is in, its __init__ method should accept a concat argument. It will be passed True or False by the network trainer depending on which format the inputs are in.
- class dnadna.nets.Network[source]
Bases: torch.nn.modules.module.Module, dnadna.utils.plugins.Pluggable
Base class for DNADNA neural nets.
All neural nets, including user-defined neural nets in plugins, must use this base class, as it adds the net to the registry of nets known by the software. Sub-modules used by the net but that are not meant for use on their own should still use torch.nn.Module as their base class.
- abstract property forward
The forward function of the network (see torch.nn.Module for more details). Should accept a batch of SNP matrices as input, which may be in either concat format (where the position array has the SNP matrix concatenated to it) or product format (where the position array is multiplied by the SNP matrix).
If the operation of the net depends on which format the input is in, its __init__ method should accept a concat argument. It will be passed True or False by the network trainer depending on which format the inputs are in.
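The rule about the concat argument can be sketched as follows. This is not dnadna's trainer code: `build_net` and both net classes are hypothetical, and inspecting the `__init__` signature is only one plausible way a trainer could decide whether to pass concat. Plain classes are used instead of Network subclasses so the sketch runs without dnadna or PyTorch installed.

```python
import inspect


class FormatAgnosticNet:
    """Hypothetical net whose computation ignores the input format."""

    def __init__(self, n_outputs):
        self.n_outputs = n_outputs


class FormatAwareNet:
    """Hypothetical net that needs to know the input format."""

    def __init__(self, n_outputs, concat=True):
        self.n_outputs = n_outputs
        self.concat = concat


def build_net(net_cls, params, concat):
    """Instantiate net_cls, passing concat only if __init__ accepts it.

    This mimics the behaviour described above; it is not dnadna's trainer code.
    """
    accepted = inspect.signature(net_cls.__init__).parameters
    kwargs = dict(params)
    if "concat" in accepted:
        kwargs["concat"] = concat
    return net_cls(**kwargs)


aware = build_net(FormatAwareNet, {"n_outputs": 3}, concat=False)
agnostic = build_net(FormatAgnosticNet, {"n_outputs": 3}, concat=False)
print(aware.concat, hasattr(agnostic, "concat"))
```

A net that does not declare concat is built without it, which matches the "should accept a concat argument" wording only applying when the net's operation depends on the format.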
- classmethod get_schema()[source]
Returns a schema pairing the network.name property with the valid network.params associated with that network (which may be very broad if the Network subclass does not specify its Network.schema).
- schema = {}
Schema for the network’s net_params, the section in the training config for parameters the net instance should be instantiated with (e.g. n_snp, n_indiv in the case of SPIDNA1, among others).
It can be either a string containing the name (without the .yml extension) of a schema in the default schema path (for built-in nets) or a dict representing the schema.
If left empty, the net_params simply won’t be validated when loading the config.
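To make the dict form of schema concrete, here is a minimal sketch. It assumes a JSON Schema style dict, which this page does not confirm, and `validate_net_params` is a hypothetical stand-in validator that understands only the few keywords used here; dnadna's own validation is not shown.

```python
# Hypothetical schema for a net taking n_snp and n_indiv, written as a
# JSON Schema style dict; the exact keywords dnadna expects are an assumption.
NET_PARAMS_SCHEMA = {
    "type": "object",
    "properties": {
        "n_snp": {"type": "integer", "minimum": 1},
        "n_indiv": {"type": "integer", "minimum": 1},
    },
    "required": ["n_snp", "n_indiv"],
}


def validate_net_params(net_params, schema):
    """Tiny stand-in validator covering only the keywords used above."""
    if not schema:  # empty schema: nothing is validated
        return []
    errors = [f"missing required parameter: {key}"
              for key in schema.get("required", []) if key not in net_params]
    for key, rules in schema.get("properties", {}).items():
        if key not in net_params:
            continue
        value = net_params[key]
        # bool is a subclass of int in Python, so exclude it explicitly
        if rules.get("type") == "integer" and (
                not isinstance(value, int) or isinstance(value, bool)):
            errors.append(f"{key} must be an integer")
        elif "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{key} must be >= {rules['minimum']}")
    return errors


print(validate_net_params({"n_snp": 400, "n_indiv": 50}, NET_PARAMS_SCHEMA))
print(validate_net_params({"n_snp": 0}, NET_PARAMS_SCHEMA))
print(validate_net_params({"anything": "goes"}, {}))
```

The last call mirrors the documented behaviour of an empty schema: any net_params pass because nothing is checked.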
- class dnadna.nets.SPIDNA(n_blocks, n_features, n_outputs)[source]
Bases:
dnadna.nets.Network
SPIDNA is a convolutional neural network that infers evolutionary parameters.
This network’s predictions are invariant to the permutation of individuals in the SNP matrix and adaptive to the number of individuals.
It is also adaptive to the number of SNPs, although it is recommended to evaluate the performance when the number of SNPs varies, because batch normalization is applied.
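The invariance and adaptivity properties described above can be illustrated with a symmetric reduction over individuals. This sketch is not SPIDNA's architecture; `mean_over_individuals` is a hypothetical function showing only the principle that averaging over rows ignores their order and works for any number of rows.

```python
# Illustrative only: this is not SPIDNA's architecture, just the principle
# that a symmetric reduction over individuals ignores their order and count.
def mean_over_individuals(snp_matrix):
    """Average each SNP column over the individuals (rows)."""
    n_indiv = len(snp_matrix)
    return [sum(column) / n_indiv for column in zip(*snp_matrix)]


snp = [          # 3 individuals x 4 SNPs
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
permuted = snp[::-1]   # same individuals, different order
fewer = snp[:2]        # a different number of individuals

# Same output for any ordering of the individuals.
print(mean_over_individuals(snp) == mean_over_individuals(permuted))
# Output length depends only on the number of SNPs, not on n_indiv.
print(len(mean_over_individuals(snp)), len(mean_over_individuals(fewer)))
```

Integer alleles are summed before dividing, so the comparison is exact regardless of row order.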
Task
Regression
- Constraints
min_snp (400)
max_snp (no constraint)
min_indiv (2)
max_indiv (no constraint)
Warning
None
Notes
This net has been used to predict population sizes through time.
It is called “SPIDNA batch normalization” in Sanchez et al. 2020 and was trained with data cropped to a fixed number of SNPs (400) and individuals (50) without padding. However, this is not a constraint of the architecture, which is adaptive to the number of individuals and SNPs.
Publication
T. Sanchez, J. Cury, G. Charpiat, and F. Jay, “Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation”, Mol Ecol Resour, p. 1755‑0998.13224, Jul. 2020, doi: 10.1111/1755-0998.13224.
- Parameters
- forward(x)[source]
The forward function of the network (see torch.nn.Module for more details). Should accept a batch of SNP matrices as input, which may be in either concat format (where the position array has the SNP matrix concatenated to it) or product format (where the position array is multiplied by the SNP matrix).
If the operation of the net depends on which format the input is in, its __init__ method should accept a concat argument. It will be passed True or False by the network trainer depending on which format the inputs are in.
- class dnadna.nets.SPIDNABlock(n_outputs, n_features)[source]
Bases:
torch.nn.modules.module.Module
Sub-part of the SPIDNA network. The number of SPIDNABlock instances inside the SPIDNA network is set by the n_blocks parameter.
- forward(x, output)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.