
Forum: applied-research

Bigenerational ACE
By: Will Beasley on 2013-11-19 22:41
[forum:40082]

ArchiveAce-MdanFakeRandom.pdf (57) downloads
### Excitement 2: thorough coverage
In the previous post's attachment, you can see that there are 24k pairs (instead of 43k). That’s because not all subjects have a value for ‘adult height’ (mostly because a lot of Gen2 subjects are too young). The second attachment shows the ACE of an outcome that’s actually just a randomly generated value. There are two points to this. First, the ACE estimates are where they should be (i.e., close to 0, 0, 1). Second, the ACE uses 42,025 pairs, which is exactly the number of pairs with a nonmissing `RFull`. It didn’t occur to me until just now that we’ve classified 98.25% of all pairs (42,025 of the 42,773 possible pairs); until now I've been concentrating on the generations separately. That’s nuts. I feel comfortable claiming that if the researcher has a phenotype value for both members of a pair, there’s a 99+% chance we have an `RFull` for it.
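To illustrate why a purely random outcome should land near (0, 0, 1), here is a minimal simulation sketch in base R. It uses a DeFries-Fulker-style regression on double-entered pairs, which is not necessarily the estimator behind the attached PDF. The mix of relatedness values, the sample size, and all variable names are assumptions made up for the example.

```r
set.seed(2013)

n_pairs <- 20000
# Hypothetical mix of relatedness coefficients (not the real RFull distribution).
r  <- sample(c(0.125, 0.25, 0.375, 0.5), n_pairs, replace = TRUE)
# A "phenotype" that is pure noise, drawn independently for each pair member.
y1 <- rnorm(n_pairs)
y2 <- rnorm(n_pairs)

# Double-enter the pairs so each subject appears once on each side.
ds <- data.frame(R = c(r, r), Y1 = c(y1, y2), Y2 = c(y2, y1))
ds$Y1 <- ds$Y1 - mean(ds$Y1)  # center the outcome
ds$Y2 <- ds$Y2 - mean(ds$Y2)

# No-intercept regression: the Y2 slope estimates c^2, the R*Y2 slope estimates a^2.
fit <- lm(Y1 ~ 0 + Y2 + I(R * Y2), data = ds)
c2  <- unname(coef(fit)[1])
a2  <- unname(coef(fit)[2])
e2  <- 1 - a2 - c2

round(c(a2 = a2, c2 = c2, e2 = e2), 2)
```

Because the two members' values are unrelated regardless of `R`, both slopes should hover around zero, which leaves nearly all the variance in E.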

Bigenerational ACE of Adult Height
By: Will Beasley on 2013-11-19 22:39
[forum:40081]

ArchiveAce-MdanHeight.pdf (63) downloads
### Excitement 1: consistent & desirable ACE estimates
Here’s the ACE for every possible pair in the NLSY79 in which both members have a value for adult height. As I added the extra relationship paths, the A^2 moved to .74, which is in the direction we want. It is .69 using only Gen1Housemates; it is .79 using only Gen2Sibs.

Another nice thing is that our current versions (i.e., the first four rows in Table 1) look great next to the previous version of the links (i.e., `RImplicit2004`), which had A^2=.66. Our best guess is .74, and even better, our current `RImplicit` (which ignores the explicit items) gives .67.

Perhaps the best news of all is that the `RExplicit` is .74. For Gen1, the current explicit algorithm has almost no overlap with the 2004 version of the links. It almost exclusively uses the 1979 Roster and the 2006 explicit items. The previous Gen1 links used neither; they used almost entirely different NLSY survey items. That’s remarkable test-retest consistency, right? I’m also impressed that we improved on the old full sample by 36% (from 31k to 42k pairs). When they last did Gen1 in the mid-90s, they were restricted to using only first pairs and slide rules.

And Gen1Housemates are the most important links, because they affect the Gen2Cousins and the AuntNieces (which also include uncles & nephews). That’s almost half of the 42k links:
    > table(Links79Pair$RelationshipPath)
    Gen1Housemates   Gen2Siblings    Gen2Cousins    ParentChild      AuntNiece
              5302          11088           4995          11504           9884
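As a quick check on the arithmetic, the counts above can be tallied in base R. This sketch hard-codes the numbers from the table rather than loading `Links79Pair`, and it assumes that "almost half" refers to Gen1Housemates plus the two paths that depend on them. The 42,025 figure is the `RFull` count quoted in the follow-up post.

```r
# Pair counts copied from the table() output above.
path_counts <- c(
  Gen1Housemates = 5302,
  Gen2Siblings   = 11088,
  Gen2Cousins    = 4995,
  ParentChild    = 11504,
  AuntNiece      = 9884
)

total_pairs <- sum(path_counts)  # all possible pairs

# Paths that are Gen1Housemates or are derived from them (assumed grouping).
gen1_dependent <- c("Gen1Housemates", "Gen2Cousins", "AuntNiece")
gen1_share     <- sum(path_counts[gen1_dependent]) / total_pairs

# Coverage: pairs with a nonmissing RFull, as reported in the follow-up post.
pairs_with_rfull <- 42025
coverage         <- pairs_with_rfull / total_pairs

total_pairs              # 42773
round(gen1_share, 3)     # 0.472
round(100 * coverage, 2) # 98.25
```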

If you want details about the outcome variable, the adult heights were standardized after splitting on three variables: generation, gender, and age. Exact details for Gen1 and Gen2 are specified in these two documents. The html is probably the easiest to interpret.

https://github.com/LiveOak/NlsyLinksDetermination/tree/92c0c0d8d9e525180e85bbd86f2d0f6086e7fd08/ForDistribution/Outcomes/Gen1Height

https://github.com/LiveOak/NlsyLinksDetermination/tree/92c0c0d8d9e525180e85bbd86f2d0f6086e7fd08/ForDistribution/Outcomes/Gen2Height
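For a rough idea of what "standardized after splitting" can look like, here is a minimal base R sketch that z-scores height within each generation-by-gender-by-age cell. The data are simulated and the column names (`Generation`, `Gender`, `Age`, `HeightInches`, `HeightZ`) are hypothetical; the linked documents remain the authority on the actual procedure, including any trimming or age grouping.

```r
set.seed(79)

n  <- 2000
ds <- data.frame(
  Generation   = sample(1:2, n, replace = TRUE),
  Gender       = sample(c("F", "M"), n, replace = TRUE),
  Age          = sample(19:24, n, replace = TRUE),
  HeightInches = rnorm(n, mean = 67, sd = 4)  # simulated, not NLSY data
)

# z-score helper: center and scale within whatever group it receives.
zscore <- function(x) (x - mean(x)) / sd(x)

# ave() applies the helper separately within each combination of the three factors.
ds$HeightZ <- ave(ds$HeightInches, ds$Generation, ds$Gender, ds$Age, FUN = zscore)

# Each cell should now have a mean of about 0 and an SD of about 1.
cell_means <- tapply(ds$HeightZ, list(ds$Generation, ds$Gender, ds$Age), mean)
cell_sds   <- tapply(ds$HeightZ, list(ds$Generation, ds$Gender, ds$Age), sd)
```

Standardizing within cells removes mean differences between generations, genders, and ages, so the pair resemblance that feeds the ACE is not driven by those differences.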
