GenClone: a computer program to analyze genotypic data, test for clonality and describe spatial clonal organization.
Arnaud-Haond Sophie and Belkhir Khalid
«Team MAREE» - CCMAR, Algarve University, FCMA, Gambelas, 8005-139 Faro, PORTUGAL
«Génome, Populations, Interactions »-Université Montpellier II, Place Eugène Bataillon ; 34090 Montpellier Cedex, FRANCE

GENCLONE is designed for studying clonality and its spatial components using genotype data with molecular markers from haploid or diploid organisms.

GenClone 2.0 performs the following tasks:

  1. Discriminates distinct multilocus genotypes (MLGs), and uses permutation and re-sampling approaches to test for the reliability of sets of loci and sampling units for estimating genotypic and genetic diversity. (This is also useful for non-clonal organisms.)
  2. Computes statistics to test for clonal propagation or clonal identity of replicates.
  3. Computes various indices describing genotypic diversity.
  4. Summarizes the spatial organization of MLGs with adapted spatial autocorrelation methods and clonal subrange estimates.
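
As an illustration of the first task, discriminating distinct MLGs amounts to grouping sampling units that share the same genotype at every locus. The Python sketch below is not GenClone's own code; the function name and the small sample dictionary are hypothetical and serve only to show the idea.

```python
from collections import defaultdict

def group_mlgs(genotypes):
    """Group sampling units by identical multilocus genotype (MLG).

    `genotypes` maps a sampling-unit name to a tuple of per-locus
    genotype strings (hypothetical in-memory layout, not a GenClone format).
    """
    groups = defaultdict(list)
    for unit, mlg in genotypes.items():
        groups[tuple(mlg)].append(unit)
    return dict(groups)

# Hypothetical toy data: three diploid sampling units typed at two loci.
sample = {
    "u1": ("150152", "200200"),
    "u2": ("150152", "200200"),
    "u3": ("150154", "200202"),
}

mlgs = group_mlgs(sample)
n_units = len(sample)   # number of sampling units
n_mlgs = len(mlgs)      # number of distinct MLGs
print(n_units, n_mlgs)
```

Units that fall in the same group are candidates for clonal replicates; whether they truly belong to the same clone is what the statistics of task 2 are meant to test.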

What changed from GenClone 1.1 to GenClone 2.0

The new features from Arnaud-Haond et al. (in press in Molecular Ecology, doi: 10.1111/j.1365-294X.2007.03535.x) and Rozenfeld et al. (2007), namely the Pareto distribution, the spectrum of microsatellite distances, the aggregation index and the edge effect index, are now implemented.

The procedure checking for missing data has been corrected and is now functional for datasets from haploid organisms.

Planned for the next version: change some ActiveX procedures in order to increase the maximum number of loci, which is presently limited to 42.

What changed in GenClone 1.1

The coherence of the data is now checked when each infile is opened: when the number of individuals and the number of loci do not correspond to the number of lines and columns, an error message informs the user that the file should be corrected.

Central coordinates are now estimated without the assumption that the transect starts at [0,0].

An ‘outfile.arp’ file in Arlequin format can now be exported.

GenClone 1.0 suffered from a bug in the estimates of pgen(fis), caused by the Round Robin method under certain conditions: very rare alleles disappear from the allelic frequency estimates in this procedure, which is expected, but this produced an error in some cases. This is now fixed. Very rare alleles are now considered as having a frequency p < 1/(ploidy x number of distinct MLGs).
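
The frequency bound for very rare alleles can be written out as a short Python sketch. The function names are hypothetical and this is not GenClone's code; it only restates the threshold 1/(ploidy x number of distinct MLGs) given above.

```python
def rare_allele_threshold(ploidy, n_distinct_mlgs):
    """Frequency bound 1 / (ploidy x number of distinct MLGs)."""
    if ploidy < 1 or n_distinct_mlgs < 1:
        raise ValueError("ploidy and number of MLGs must be positive")
    return 1.0 / (ploidy * n_distinct_mlgs)

def is_very_rare(frequency, ploidy, n_distinct_mlgs):
    """True when an allele frequency lies strictly below the bound."""
    return frequency < rare_allele_threshold(ploidy, n_distinct_mlgs)

# Hypothetical case: a diploid dataset with 25 distinct MLGs gives a
# bound of 1 / (2 x 25), i.e. one allele copy among 50.
threshold = rare_allele_threshold(ploidy=2, n_distinct_mlgs=25)
print(threshold)
```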

A bug in the estimate of distances in terms of number of alleles occurred depending on the file format and has been fixed.

Autocorrelation with certain numbers of distance classes resulted in the last classes not being represented (i.e. empty), although without any miscalculation. This is also fixed.

The manual showed the number of loci and the ploidy level of the infile in inverted order; this is corrected.

Below is some technical information on how to use GenClone:

I. How to install GenClone

Open the downloaded zipped file and extract all files to a chosen folder.
To install the program, double click on the file "setup.exe" and follow the instructions.

After installation and before running the program for the first time, it is suggested that you load and run the example infile to verify that the program is working properly and that you obtain correct results (as provided in the example outfiles).

The program first displays the basic information from the input file; the various analyses are listed in the menu bar.

II. How to prepare the infile

One infile must be prepared for each population or sampling site. The infile can be prepared with Microsoft EXCEL (as described below) and saved in *.txt format (save the file as tab-delimited text).

The infile must include the following information*:

The first row contains:

  • cell 1: N, the number of individuals
  • cells 2 & 3: maximum x and y length of the transect (if you do not have one or both coordinates, fill the corresponding cell with ‘999’)
  • cell 4: L, the number of loci
  • cell 5: the ploidy level, encoded as 1 for haploid and 2 for diploid organisms. If this cell is not filled, the program will assume diploidy by default.
  • starting from cell 6 (regardless of whether the ploidy level is provided), each cell may contain the name of one of the loci used. If these cells are not filled, the default name ‘loci x’ will be given to all loci, with x being the order of appearance of the locus.

The following N rows (one per sampling unit) contain:

    • column 1: name of the sampling unit
    • columns 2 & 3: the x- and y-coordinates of the sampling unit (if you do not have the coordinates, fill these columns with ‘999’)
    • each following column contains the genotype at one locus (one locus per cell), with alleles coded by 3 digits, preferably corresponding to the allele length. Avoid allele identifiers beginning with a 0 (i.e. use only allele codes greater than 099), because values starting with 0 will result in failures. Each cell should contain 6 digits for diploids and 3 digits for haploids.
    • missing data should be encoded as 999. Be aware that when individuals with missing data are kept, their membership of a given multilocus genotype group cannot be ascertained.
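
To make the allele coding concrete, here is a minimal Python sketch that splits one genotype cell into its 3-digit alleles. The function name is hypothetical, and it assumes that a cell equal to 999, or containing an allele coded 999, marks missing data; GenClone itself may handle these cases differently.

```python
MISSING = "999"  # missing-data code described above

def split_genotype(cell, ploidy=2):
    """Split a genotype cell into a tuple of 3-digit allele codes.

    Returns None for missing data. Raises ValueError when the cell does
    not hold 3 digits per allele or when an allele code starts with 0.
    """
    cell = cell.strip()
    if cell == MISSING:
        return None
    expected = 3 * ploidy  # 3 digits if haploid, 6 digits if diploid
    if len(cell) != expected or not cell.isdigit():
        raise ValueError(f"expected {expected} digits, got {cell!r}")
    alleles = tuple(cell[i:i + 3] for i in range(0, expected, 3))
    if MISSING in alleles:  # assumption: any allele coded 999 is missing
        return None
    if any(a.startswith("0") for a in alleles):
        raise ValueError(f"allele code starting with 0 in {cell!r}")
    return alleles

print(split_genotype("150152"))         # diploid cell
print(split_genotype("150", ploidy=1))  # haploid cell
print(split_genotype("999"))            # missing data
```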

*Note: your computer may be using either '.' or ',' as the decimal separator. Use the separator expected by your system when entering sampling coordinates; otherwise the spatial components will be estimated with rounded coordinates.
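
The layout described above can be checked before the file is opened in GenClone. The Python sketch below reads a tab-delimited infile and verifies that the number of rows and columns matches the header. It is an independent illustration, not GenClone's own check; the function names, the returned dictionary and the tiny three-unit example are hypothetical.

```python
import csv
import io

def to_float(text):
    """Read a coordinate written with either '.' or ',' as decimal separator."""
    return float(text.replace(",", "."))

def read_infile(handle):
    """Parse a tab-delimited infile and check that its dimensions are coherent."""
    rows = []
    for row in csv.reader(handle, delimiter="\t"):
        while row and not row[-1].strip():  # drop trailing empty cells
            row.pop()
        if row:
            rows.append([cell.strip() for cell in row])
    header, data = rows[0], rows[1:]
    n_units, n_loci = int(header[0]), int(header[3])
    # Cell 5 is the ploidy level; diploidy is assumed when it is empty.
    ploidy = int(header[4]) if len(header) > 4 and header[4] else 2
    locus_names = header[5:5 + n_loci] or [f"loci {i}" for i in range(1, n_loci + 1)]
    if len(data) != n_units:
        raise ValueError(f"header announces {n_units} units, found {len(data)} rows")
    units = []
    for row in data:
        if len(row) != 3 + n_loci:  # name, x, y, then one cell per locus
            raise ValueError(f"row {row[0]!r} has {len(row)} cells, expected {3 + n_loci}")
        units.append({"name": row[0], "x": to_float(row[1]),
                      "y": to_float(row[2]), "genotypes": row[3:]})
    return {"n_units": n_units, "n_loci": n_loci, "ploidy": ploidy,
            "locus_names": locus_names, "units": units}

# Hypothetical three-unit, two-locus diploid infile (not real data).
example = (
    "3\t80\t20\t2\t2\tlocA\tlocB\n"
    "u1\t0\t0\t150152\t200200\n"
    "u2\t2,5\t0\t150152\t200200\n"
    "u3\t5\t1.5\t150154\t999\n"
)
info = read_infile(io.StringIO(example))
print(info["n_units"], info["n_loci"], info["ploidy"], info["locus_names"])
```

For a file on disk, the same function can be called on an open file object, for instance `open("infile.txt", newline="")`.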

Example for 40 sampling units of a diploid organism, with a transect of 80 x 20 metres and 8 loci:

- prepared on Excel:

- saved as a txt (opened with TextPad):

III. How to run GENCLONE

Double click on GenClone.exe

Open your "infile.txt":

1. Section "Test":

Example after using the combination procedure for loci (left), or asking for the matrix of allelic distances:

2. Section "MLG" to calculate richness and diversity indices

3. Section "Spatial Components" (with an example of autocorrelation analysis):

