plink 软件中 --check-sex参数

2022/7/12 23:23:57

本文主要是介绍plink 软件中 --check-sex参数,对大家解决编程问题具有一定的参考价值,需要的程序猿们随着小编来一起学习吧!

 

1、--check-sex用于验证性别信息是否可靠。 检测依据是对x染色体进行纯合度F值统计。

 

001、

root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls
HapMap_3_r3_5.bed  HapMap_3_r3_5.bim  HapMap_3_r3_5.fam         ## 检测性别异常个体
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# plink --bfile HapMap_3_r3_5 --check-sex
PLINK v1.90b6.26 64-bit (2 Apr 2022)           www.cog-genomics.org/plink/1.9/
(C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --bfile HapMap_3_r3_5
  --check-sex

16007 MB RAM detected; reserving 8003 MB for main workspace.
1430443 variants loaded from .bim file.
165 people (80 males, 85 females) loaded from .fam.
112 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 112 founders and 53 nonfounders present.
Calculating allele frequencies... done.
Warning: 179 het. haploid genotypes present (see plink.hh ); many commands
treat these as missing.
Total genotyping rate is 0.997899.
1430443 variants and 165 people pass filters and QC.
Among remaining phenotypes, 56 are cases and 56 are controls.  (53 phenotypes
are missing.)
--check-sex: 23424 Xchr and 0 Ychr variant(s) scanned, 1 problem detected.
Report written to plink.sexcheck .

 

提取异常个体:

root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls
HapMap_3_r3_5.bed  HapMap_3_r3_5.bim  HapMap_3_r3_5.fam  plink.hh  plink.log  plink.sexcheck
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head plink.sexcheck
    FID       IID       PEDSEX       SNPSEX       STATUS            F
   1328   NA06989            2            2           OK     -0.01184
   1377   NA11891            1            1           OK            1
   1349   NA11843            1            1           OK            1
   1330   NA12341            2            2           OK     -0.01252
   1444   NA12739            1            1           OK            1
   1344   NA10850            2            2           OK      0.01496
   1328   NA06984            1            1           OK            1
   1463   NA12877            1            1           OK            1
   1418   NA12275            2            2           OK      -0.1028
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '$(NF - 1) != "OK"' plink.sexcheck
    FID       IID       PEDSEX       SNPSEX       STATUS            F
   1349   NA10854            2            1      PROBLEM         0.99   ## 有问题的个体

 

002、验证

root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls
HapMap_3_r3_5.bed  HapMap_3_r3_5.bim  HapMap_3_r3_5.fam
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# wc -l HapMap_3_r3_5.fam
165 HapMap_3_r3_5.fam
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk 'BEGIN {print 1/(165 * 2)}'
0.0030303
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# plink --bfile HapMap_3_r3_5 --chr 23 --maf 0.0030303 --recode --out chr23
PLINK v1.90b6.26 64-bit (2 Apr 2022)           www.cog-genomics.org/plink/1.9/
(C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to chr23.log.
Options in effect:
  --bfile HapMap_3_r3_5
  --chr 23
  --maf 0.0030303
  --out chr23
  --recode

16007 MB RAM detected; reserving 8003 MB for main workspace.
31490 out of 1430443 variants loaded from .bim file.
165 people (80 males, 85 females) loaded from .fam.
112 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 112 founders and 53 nonfounders present.
Calculating allele frequencies... done.
Warning: 179 het. haploid genotypes present (see chr23.hh ); many commands
treat these as missing.
Total genotyping rate is 0.990832.
8066 variants removed due to minor allele threshold(s)
(--maf/--max-maf/--mac/--max-mac).
23424 variants and 165 people pass filters and QC.
Among remaining phenotypes, 56 are cases and 56 are controls.  (53 phenotypes
are missing.)
--recode ped to chr23.ped + chr23.map ... done.
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# ls
HapMap_3_r3_5.bed  HapMap_3_r3_5.bim  HapMap_3_r3_5.fam  chr23.hh  chr23.log  chr23.map  chr23.ped
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head chr23.map
23      rs311168        0       2720840
23      rs311173        0       2722661
23      rs311183        0       2724756
23      rs311187        0       2726346
23      rs5939329       0       2728135
23      rs3672  0       2743641
23      rs3671  0       2743668
23      rs901321        0       2747282
23      rs5982872       0       2754765
23      rs5939351       0       2760122
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# sed 's/^.//' chr23.map -i
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head chr23.map
3       rs311168        0       2720840
3       rs311173        0       2722661
3       rs311183        0       2724756
3       rs311187        0       2726346
3       rs5939329       0       2728135
3       rs3672  0       2743641
3       rs3671  0       2743668
3       rs901321        0       2747282
3       rs5982872       0       2754765
3       rs5939351       0       2760122
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# plink --file chr23 --het --out chr23_3
PLINK v1.90b6.26 64-bit (2 Apr 2022)           www.cog-genomics.org/plink/1.9/
(C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to chr23_3.log.
Options in effect:
  --file chr23
  --het
  --out chr23_3

16007 MB RAM detected; reserving 8003 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (23424 variants, 165 people).
--file: chr23_3-temporary.bed + chr23_3-temporary.bim + chr23_3-temporary.fam
written.
23424 variants loaded from .bim file.
165 people (80 males, 85 females) loaded from .fam.
112 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 112 founders and 53 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.990283.
23424 variants and 165 people pass filters and QC.
Among remaining phenotypes, 56 are cases and 56 are controls.  (53 phenotypes
are missing.)
--het: 23424 variants scanned, report written to chr23_3.het .
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# head chr23_3.het
    FID       IID       O(HOM)       E(HOM)        N(NM)            F
   1328   NA06989        16128    1.621e+04        23373     -0.01142
   1377   NA11891        23384    1.622e+04        23384            1
   1349   NA11843        23412    1.624e+04        23412            1
   1330   NA12341        16132    1.622e+04        23388     -0.01209
   1444   NA12739        23405    1.623e+04        23405            1
   1344   NA10850        16327    1.622e+04        23385      0.01538
   1328   NA06984        23419    1.624e+04        23419            1
   1463   NA12877        23387    1.622e+04        23387            1
   1418   NA12275        15464     1.62e+04        23352      -0.1023
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '{print $5}' chr23.ped | sed '1i sex' | paste - chr23_3.het | head
sex         FID       IID       O(HOM)       E(HOM)        N(NM)            F
2          1328   NA06989        16128    1.621e+04        23373     -0.01142   ## f值一致
1          1377   NA11891        23384    1.622e+04        23384            1
1          1349   NA11843        23412    1.624e+04        23412            1
2          1330   NA12341        16132    1.622e+04        23388     -0.01209
1          1444   NA12739        23405    1.623e+04        23405            1
2          1344   NA10850        16327    1.622e+04        23385      0.01538
1          1328   NA06984        23419    1.624e+04        23419            1
1          1463   NA12877        23387    1.622e+04        23387            1
2          1418   NA12275        15464     1.62e+04        23352      -0.1023
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '{print $5}' chr23.ped | sed '1i sex' | paste - chr23_3.het | awk '$1 == 1 && $NF < 0.8'
root@DESKTOP-1N42TVH:/home/test/GWA_tutorial/1_QC_GWAS/test# awk '{print $5}' chr23.ped | sed '1i sex' | paste - chr23_3.het | awk '$1 == 2 && $NF > 0.2'
2          1349   NA10854         6632         4748         6651         0.99    ## 问题个体

 



这篇关于plink 软件中 --check-sex参数的文章就介绍到这儿,希望我们推荐的文章对大家有所帮助,也希望大家多多支持为之网!


扫一扫关注最新编程教程