[#2664] export.merlin produces incorrect ped files.

Date:
2013-04-04 15:35
Priority:
4
State:
Open
Submitted by:
Daniel Taliun (dtaliun)
Assigned to:
Lennart Karssen (lckarssen)
Resolution:
Accepted As Bug
Operating System:
Linux (64bit)
Severity:
major
Hardware:
All
Version:
v1.8-0
Component:
GenABEL
URL:
Summary:
export.merlin produces incorrect ped files.

Detailed description
GenABEL v. 1.7-4 (February 22, 2013)

When exporting a subset of individuals from a gwaa.data object with export.merlin, the genotypes in the resulting ped files are incorrect.

Additionally, the ped files produced by the two commands below, which differ only in stepids,
export.merlin(df[ids, snps], datafile=data_file_name, pedfile=ped_file_name, mapfile=map_file_name, order=T, fixstrand="+", stepids=1)
and
export.merlin(df[ids, snps], datafile=data_file_name, pedfile=ped_file_name, mapfile=map_file_name, order=T, fixstrand="+", stepids=100)
are not identical.
When stepids=1, the ped file seems to be correct.

This behavior is observed only with a subset of ids. When the gwaa.data object is exported completely, everything is OK.

Comments:

Date: 2015-07-30 05:35
Sender: Nadezhda Belonogova

Sorry for the appearance of the previous post, all paragraph break signs are gone. I am not yet familiar with this message service.

Date: 2015-07-30 05:24
Sender: Nadezhda Belonogova

For example, correct genotypes in merlin.ped (exported with stepids = n) are:
1 id1 0 0 2 0 T/T C/C G/G G/G G/G ...
2 id2 0 0 2 0 T/C C/C G/G G/G G/G ...
3 id3 0 0 2 0 T/T C/C G/G G/G G/G ...
4 id4 0 0 2 0 T/T C/C G/G G/T G/G ...
5 id5 0 0 1 0 T/T C/C G/G G/G G/G ...
6 id6 0 0 2 0 T/C C/C G/G G/G G/G ...
7 id7 0 0 2 0 T/T C/C G/G G/G G/G ...
8 id8 0 0 2 0 T/T C/C G/G T/T G/G ...
9 id9 0 0 2 0 T/C C/C G/G G/G G/G ...
10 id10 0 0 2 0 T/T C/C G/G G/G G/G ...
...
With default stepids = 100, genotypes look strange for all markers starting from the second one:
1 id1 0 0 2 0 T/T C/C G/G G/G G/C ...
2 id2 0 0 2 0 T/C 0/0 G/G G/G G/G ...
3 id3 0 0 2 0 T/T 0/0 G/G G/G G/G ...
4 id4 0 0 2 0 T/T 0/0 G/G G/G G/G ...
5 id5 0 0 1 0 T/T C/C G/G G/G G/C ...
6 id6 0 0 2 0 T/C C/C 0/0 G/G G/G ...
7 id7 0 0 2 0 T/T C/C 0/0 G/G G/C ...
8 id8 0 0 2 0 T/T C/C 0/0 G/G G/G ...
9 id9 0 0 2 0 T/C C/C G/G G/G G/C ...
10 id10 0 0 2 0 T/T C/C G/G 0/0 G/C ...
...
The number of genotypes is correct; the *.map and *.dat files are OK and keep the same order of markers.
In addition, I tried to reproduce the bug on a small sample and found that it appears only with n > 1000.
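
For anyone who wants to quantify the difference between two exports, the comparison above can be sketched in base R. This is only a minimal sketch: compare_ped_files is a hypothetical helper (not part of GenABEL), and it assumes six leading non-genotype columns and one whitespace-separated field per genotype, as in the rows shown above. The demo uses the first five markers of the id1 and id2 rows from this comment.

```r
# Hypothetical helper (not part of GenABEL): count genotype fields that
# differ between two ped files with the same individuals and markers.
compare_ped_files <- function(ped_a, ped_b, n_leading = 6) {
  read_ped <- function(path) {
    fields <- strsplit(trimws(readLines(path)), "[[:space:]]+")
    # Every row must have the same number of fields to form a matrix
    stopifnot(length(unique(lengths(fields))) == 1)
    do.call(rbind, fields)
  }
  a <- read_ped(ped_a)
  b <- read_ped(ped_b)
  stopifnot(identical(dim(a), dim(b)), ncol(a) > n_leading)
  # Assumption: the first n_leading columns are pedigree/sex/trait fields
  geno_cols <- seq(n_leading + 1, ncol(a))
  differs <- a[, geno_cols, drop = FALSE] != b[, geno_cols, drop = FALSE]
  list(
    n_mismatches   = sum(differs),
    per_marker     = colSums(differs),
    per_individual = rowSums(differs)
  )
}

# Demo with two truncated rows taken from the listings above
good <- tempfile(fileext = ".ped")
bad  <- tempfile(fileext = ".ped")
writeLines(c("1 id1 0 0 2 0 T/T C/C G/G G/G G/G",
             "2 id2 0 0 2 0 T/C C/C G/G G/G G/G"), good)
writeLines(c("1 id1 0 0 2 0 T/T C/C G/G G/G G/C",
             "2 id2 0 0 2 0 T/C 0/0 G/G G/G G/G"), bad)
res <- compare_ped_files(good, bad)
print(res$per_marker)
```

On real exports, per_marker should be zero everywhere if the stepids = n and stepids = 100 files agree.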

Date: 2015-07-29 20:08
Sender: Lennart Karssen

Thank you Nadezhda, for reporting this here. Also thanks for providing a workaround.

Can you be a bit more specific about "genotypes are totally wrong"? Do I understand correctly that genotypes are written to the merlin file, but that they are not the correct genotypes?
Is the total number of genotypes correct?

I have re-opened the bug and increased its severity.

Date: 2015-07-29 04:06
Sender: Nadezhda Belonogova

Just found out on my data that genotypes exported by export.merlin() are totally wrong (GenABEL_1.8-0). Setting 'stepids = n' where n is the sample size seems to work fine.
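
A minimal sketch of this workaround, assuming GenABEL is installed: export_merlin_single_chunk is a hypothetical wrapper (not part of GenABEL) that sets stepids to the number of individuals in the object being exported, using GenABEL's nids(). Whether this avoids the problem in every case is not confirmed; it only reflects what seemed to work on my data.

```r
# Hypothetical wrapper (not part of GenABEL): export with stepids equal to
# the number of individuals, so all ids are handled in a single step.
export_merlin_single_chunk <- function(gdata, ...) {
  if (!requireNamespace("GenABEL", quietly = TRUE)) {
    stop("GenABEL must be installed to use this wrapper")
  }
  # Do not pass stepids in '...'; it is set here from the data itself
  GenABEL::export.merlin(gdata, ..., stepids = GenABEL::nids(gdata))
}

# Usage, with the object and file names from the original report:
# export_merlin_single_chunk(df[ids, snps],
#                            datafile = data_file_name,
#                            pedfile  = ped_file_name,
#                            mapfile  = map_file_name,
#                            order = TRUE, fixstrand = "+")
```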

Date: 2013-06-03 12:56
Sender: Lennart Karssen

Addendum to my previous comment: the unit test was added in svn r.1242.

Date: 2013-06-03 12:48
Sender: Lennart Karssen

I tried to confirm this bug, but wasn't able to (GenABEL v1.7-6).
Just to be sure I added a check test.export.merlin.bug2664() in the unit tests to make sure it doesn't pop up again.

Marking this bug closed. Feel free to re-open it if the problem appears again.

Date: 2013-05-16 16:32
Sender: Lennart Karssen

Since export.plink() calls export.merlin() when not writing tped files, bug #2055 may be related.

Attached Files:

Changes

Field        Old Value         Date              By
Resolution   Works For Me      2015-07-29 20:08  lckarssen
Severity     None              2015-07-29 20:08  lckarssen
Version      v1.7-4            2015-07-29 20:08  lckarssen
status_id    Closed            2015-07-29 20:08  lckarssen
close_date   2013-06-03 12:48  2015-07-29 20:08  lckarssen
priority     3                 2015-07-29 20:08  lckarssen
status_id    Open              2013-06-03 12:48  lckarssen
close_date   None              2013-06-03 12:48  lckarssen
assigned_to  none              2013-06-03 12:48  lckarssen
Resolution   None              2013-06-03 12:48  lckarssen
Hardware     None              2013-06-03 12:48  lckarssen
Version      None              2013-06-03 12:48  lckarssen