plink 软件中 --check-sex参数
2022/7/12 23:23:57
本文主要是介绍plink 软件中 --check-sex参数,对大家解决编程问题具有一定的参考价值,需要的程序猿们随着小编来一起学习吧!
1、--check-sex用于验证性别信息是否可靠。 检测依据是对x染色体进行纯合度F值统计。
001、
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls HapMap_3_r3_5.bed HapMap_3_r3_5.bim HapMap_3_r3_5.fam ## 检测性别异常个体 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# plink --bfile HapMap_3_r3_5 --check-sex PLINK v1.90b6.26 64-bit (2 Apr 2022) www.cog-genomics.org/plink/1.9/ (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to plink.log. Options in effect: --bfile HapMap_3_r3_5 --check-sex 16007 MB RAM detected; reserving 8003 MB for main workspace. 1430443 variants loaded from .bim file. 165 people (80 males, 85 females) loaded from .fam. 112 phenotype values loaded from .fam. Using 1 thread (no multithreaded calculations invoked). Before main variant filters, 112 founders and 53 nonfounders present. Calculating allele frequencies... done. Warning: 179 het. haploid genotypes present (see plink.hh ); many commands treat these as missing. Total genotyping rate is 0.997899. 1430443 variants and 165 people pass filters and QC. Among remaining phenotypes, 56 are cases and 56 are controls. (53 phenotypes are missing.) --check-sex: 23424 Xchr and 0 Ychr variant(s) scanned, 1 problem detected. Report written to plink.sexcheck .
提取异常个体:
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls HapMap_3_r3_5.bed HapMap_3_r3_5.bim HapMap_3_r3_5.fam plink.hh plink.log plink.sexcheck root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head plink.sexcheck FID IID PEDSEX SNPSEX STATUS F 1328 NA06989 2 2 OK -0.01184 1377 NA11891 1 1 OK 1 1349 NA11843 1 1 OK 1 1330 NA12341 2 2 OK -0.01252 1444 NA12739 1 1 OK 1 1344 NA10850 2 2 OK 0.01496 1328 NA06984 1 1 OK 1 1463 NA12877 1 1 OK 1 1418 NA12275 2 2 OK -0.1028 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '$(NF - 1) != "OK"' plink.sexcheck FID IID PEDSEX SNPSEX STATUS F 1349 NA10854 2 1 PROBLEM 0.99 ## 有问题的个体
002、验证
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls HapMap_3_r3_5.bed HapMap_3_r3_5.bim HapMap_3_r3_5.fam root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# wc -l HapMap_3_r3_5.fam 165 HapMap_3_r3_5.fam root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk 'BEGIN {print 1/(165 * 2)}' 0.0030303 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# plink --bfile HapMap_3_r3_5 --chr 23 --maf 0.0030303 --recode --out chr23 PLINK v1.90b6.26 64-bit (2 Apr 2022) www.cog-genomics.org/plink/1.9/ (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to chr23.log. Options in effect: --bfile HapMap_3_r3_5 --chr 23 --maf 0.0030303 --out chr23 --recode 16007 MB RAM detected; reserving 8003 MB for main workspace. 31490 out of 1430443 variants loaded from .bim file. 165 people (80 males, 85 females) loaded from .fam. 112 phenotype values loaded from .fam. Using 1 thread (no multithreaded calculations invoked). Before main variant filters, 112 founders and 53 nonfounders present. Calculating allele frequencies... done. Warning: 179 het. haploid genotypes present (see chr23.hh ); many commands treat these as missing. Total genotyping rate is 0.990832. 8066 variants removed due to minor allele threshold(s) (--maf/--max-maf/--mac/--max-mac). 23424 variants and 165 people pass filters and QC. Among remaining phenotypes, 56 are cases and 56 are controls. (53 phenotypes are missing.) --recode ped to chr23.ped + chr23.map ... done. root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls HapMap_3_r3_5.bed HapMap_3_r3_5.bim HapMap_3_r3_5.fam chr23.hh chr23.log chr23.map chr23.ped root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head chr23.map 23 rs311168 0 2720840 23 rs311173 0 2722661 23 rs311183 0 2724756 23 rs311187 0 2726346 23 rs5939329 0 2728135 23 rs3672 0 2743641 23 rs3671 0 2743668 23 rs901321 0 2747282 23 rs5982872 0 2754765 23 rs5939351 0 2760122 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# sed 's/^.//' chr23.map -i root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head chr23.map 3 rs311168 0 2720840 3 rs311173 0 2722661 3 rs311183 0 2724756 3 rs311187 0 2726346 3 rs5939329 0 2728135 3 rs3672 0 2743641 3 rs3671 0 2743668 3 rs901321 0 2747282 3 rs5982872 0 2754765 3 rs5939351 0 2760122 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# plink --file chr23 --het --out chr23_3 PLINK v1.90b6.26 64-bit (2 Apr 2022) www.cog-genomics.org/plink/1.9/ (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3 Logging to chr23_3.log. Options in effect: --file chr23 --het --out chr23_3 16007 MB RAM detected; reserving 8003 MB for main workspace. .ped scan complete (for binary autoconversion). Performing single-pass .bed write (23424 variants, 165 people). --file: chr23_3-temporary.bed + chr23_3-temporary.bim + chr23_3-temporary.fam written. 23424 variants loaded from .bim file. 165 people (80 males, 85 females) loaded from .fam. 112 phenotype values loaded from .fam. Using 1 thread (no multithreaded calculations invoked). Before main variant filters, 112 founders and 53 nonfounders present. Calculating allele frequencies... done. Total genotyping rate is 0.990283. 23424 variants and 165 people pass filters and QC. Among remaining phenotypes, 56 are cases and 56 are controls. (53 phenotypes are missing.) --het: 23424 variants scanned, report written to chr23_3.het . root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head chr23_3.het FID IID O(HOM) E(HOM) N(NM) F 1328 NA06989 16128 1.621e+04 23373 -0.01142 1377 NA11891 23384 1.622e+04 23384 1 1349 NA11843 23412 1.624e+04 23412 1 1330 NA12341 16132 1.622e+04 23388 -0.01209 1444 NA12739 23405 1.623e+04 23405 1 1344 NA10850 16327 1.622e+04 23385 0.01538 1328 NA06984 23419 1.624e+04 23419 1 1463 NA12877 23387 1.622e+04 23387 1 1418 NA12275 15464 1.62e+04 23352 -0.1023 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '{print $5}' chr23.ped | sed '1i sex' | paste - chr23_3.het | head sex FID IID O(HOM) E(HOM) N(NM) F 2 1328 NA06989 16128 1.621e+04 23373 -0.01142 ## f值一致 1 1377 NA11891 23384 1.622e+04 23384 1 1 1349 NA11843 23412 1.624e+04 23412 1 2 1330 NA12341 16132 1.622e+04 23388 -0.01209 1 1444 NA12739 23405 1.623e+04 23405 1 2 1344 NA10850 16327 1.622e+04 23385 0.01538 1 1328 NA06984 23419 1.624e+04 23419 1 1 1463 NA12877 23387 1.622e+04 23387 1 2 1418 NA12275 15464 1.62e+04 23352 -0.1023 root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '{print $5}' chr23.ped | sed '1i sex' | paste - chr23_3.het | awk '$1 == 1 && $NF < 0.8' root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '{print $5}' chr23.ped | sed '1i sex' | paste - chr23_3.het | awk '$1 == 2 && $NF > 0.2' 2 1349 NA10854 6632 4748 6651 0.99 ## 问题个体
这篇关于plink 软件中 --check-sex参数的文章就介绍到这儿,希望我们推荐的文章对大家有所帮助,也希望大家多多支持为之网!
- 2024-05-15PingCAP 黄东旭参与 CCF 秀湖会议,共探开源教育未来
- 2024-05-13PingCAP 戴涛:构建面向未来的金融核心系统
- 2024-05-09flutter3.x_macos桌面os实战
- 2024-05-09Rust中的并发性:Sync 和 Send Traits
- 2024-05-08使用Ollama和OpenWebUI在CPU上玩转Meta Llama3-8B
- 2024-05-08完工标准(DoD)与验收条件(AC)究竟有什么不同?
- 2024-05-084万 star 的 NocoDB 在 sealos 上一键起,轻松把数据库编程智能表格
- 2024-05-08Mac 版Stable Diffusion WebUI的安装
- 2024-05-08解锁CodeGeeX智能问答中3项独有的隐藏技能
- 2024-05-08RAG算法优化+新增代码仓库支持,CodeGeeX的@repo功能效果提升