Uploading metagenomic data to SRA
About the attribute and metadata sheets
SUBMITTER
GENERAL INFO
PROJECT INFO
BIOSAMPLE TYPE
BIOSAMPLE ATTRIBUTES
SRA METADATA
FILES
REVIEW & SUBMIT
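A quick Baidu search shows that every major tutorial teaches you how to register. The part where people actually get stuck, though, is filling in the attribute and metadata sheets; even if you bluff your way past them, you end up coming back to fix both.
First, the attribute sheet:
After downloading the Excel template I was honestly impressed by NCBI: they manage to get fancy even with Excel. The header looks like this: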
*sample_name sample_title bioproject_accession *organism host isolation_source *collection_date *geo_loc_name *lat_lon ref_biomaterial rel_to_oxygen samp_collect_device samp_mat_process samp_size source_material_id description
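Ganggang gave every sample its own unique sample name from the very start, but kept getting an error. Tips floating around online say that putting a number in front of the sample name, or deleting the yellow columns, gets you through; Ganggang tried both in 2020 and neither worked. What did work was adding a column named replicate at the end. Yes, the default template has no such column; add it yourself and it automatically turns yellow (optional). Presumably this works because BioSample wants each row to differ in at least one attribute other than the sample name, title and description.
The final upload then gave a Warning: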
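A check like this can be run locally before uploading. The sketch below assumes the attribute sheet has been saved as tab-separated text, and that sample_name, sample_title and description are the only columns ignored when deciding whether two samples are distinct (that is my reading of the error, not something the template states). The function name and the example rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Assumption: these columns do not count towards making a sample distinct.
IGNORED = {"sample_name", "sample_title", "description"}


def find_indistinct_samples(tsv_text):
    """Return groups of sample names whose remaining attributes are all identical."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        # Drop the leading "*" that marks required columns in the template header.
        clean = {col.lstrip("*"): (val or "").strip()
                 for col, val in row.items() if col}
        key = tuple(sorted((c, v) for c, v in clean.items() if c not in IGNORED))
        groups[key].append(clean["sample_name"])
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical two-sample sheet: unique names, but every other attribute is the same.
sheet = (
    "*sample_name\t*organism\t*collection_date\n"
    "gut_A\tgut metagenome\t2020-05-01\n"
    "gut_B\tgut metagenome\t2020-05-01\n"
)
# The same sheet with a replicate column appended.
sheet_with_replicate = (
    "*sample_name\t*organism\t*collection_date\treplicate\n"
    "gut_A\tgut metagenome\t2020-05-01\t1\n"
    "gut_B\tgut metagenome\t2020-05-01\t2\n"
)

print(find_indistinct_samples(sheet))
print(find_indistinct_samples(sheet_with_replicate))
```

The first call reports gut_A and gut_B as one indistinct group; the second reports nothing, which matches what happened once the replicate column was added.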
Submission processing may be delayed due to necessary curator review. Please check spelling of organism, current information could not be resolved automatically and will require a taxonomy consult.
Then on to submitting the metadata:
sample_name library_ID title library_strategy library_source library_selection library_layout platform instrument_model design_description filetype filename filename2 filename3 filename4 assembly fasta_file
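This one is more user friendly: most fields are pick-from-a-list, so you do not have to type everything in yourself. The file name and description are required as well.
They have already flagged it in red: the template downloads as Excel, but you can only upload TSV. Out of sheer habit Ganggang tossed the Excel file up, and of course the result was Error: Excel workbook does not contain worksheet “SRA_data”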
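One way to sidestep the Excel question entirely is to write the sheet as tab-separated text yourself. Below is a minimal sketch using only Python's csv module; the column list is copied from the header above, while the function name and the example row (sample name, file names, instrument and so on) are made-up placeholders, not values from this submission.

```python
import csv
import io

# Column order copied from the SRA metadata header above.
SRA_COLUMNS = [
    "sample_name", "library_ID", "title", "library_strategy",
    "library_source", "library_selection", "library_layout", "platform",
    "instrument_model", "design_description", "filetype",
    "filename", "filename2", "filename3", "filename4",
    "assembly", "fasta_file",
]


def rows_to_tsv(rows):
    """Serialize a list of dicts as tab-separated text with the SRA header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=SRA_COLUMNS, delimiter="\t",
                            lineterminator="\n", restval="")
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are written as empty
    return buf.getvalue()


# Hypothetical row for illustration only.
example_row = {
    "sample_name": "gut_A",
    "library_ID": "gut_A_lib1",
    "title": "Shotgun metagenome of gut sample A",
    "library_strategy": "WGS",
    "library_source": "METAGENOMIC",
    "library_selection": "RANDOM",
    "library_layout": "paired",
    "platform": "ILLUMINA",
    "instrument_model": "Illumina HiSeq 2500",
    "design_description": "Total DNA, random shotgun library",
    "filetype": "fastq",
    "filename": "gut_A_R1.fastq.gz",
    "filename2": "gut_A_R2.fastq.gz",
}

tsv_text = rows_to_tsv([example_row])
print(tsv_text)
```

To get a file for upload, write tsv_text out with open("sra_metadata.tsv", "w", encoding="utf-8"); that output file name is arbitrary too.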
After that came these: Warning: Your file(s) are missing an extension and do not exactly match the uploaded files on your metadata sheet. Please review and update these files.
Warning:Paired-end data is usually submitted in two files. Please be sure to enter both matching filenames on the same row of the spreadsheet using the extra ‘Filename’ columns provided.
Warning:If you are submitting metagenomic and/or metatranscriptomic data sets, sequence data should be split by each sample barcode, for individual data files.
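Next came uploading the fastq files, which produced a warning: File(s) you included on your spreadsheet are missing: xxxx. Please add the missing files or if they’re included, click ‘Continue’ to proceed with archive extraction which will detect all files.
Basically none of the files Ganggang uploaded could be matched to the metadata -.- Nothing for it but to go back and edit the metadata, adding the format extension to every filename.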
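This kind of mismatch is easy to catch before uploading. The sketch below assumes the metadata has been saved as tab-separated text with the column names shown in the header above. It flags file names without an extension, file names not in the list of uploaded files, and paired rows that list fewer than two files. The function name and all sample and file names are invented for the example.

```python
import csv
import io
import os

FILENAME_COLUMNS = ("filename", "filename2", "filename3", "filename4")


def check_filenames(tsv_text, uploaded_names):
    """Return a list of human-readable problems found in the metadata sheet."""
    uploaded = set(uploaded_names)
    problems = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        sample = row.get("sample_name", "?")
        listed = [(row.get(col) or "").strip() for col in FILENAME_COLUMNS]
        listed = [name for name in listed if name]
        for name in listed:
            if not os.path.splitext(name)[1]:
                problems.append(f"{sample}: '{name}' has no extension")
            if name not in uploaded:
                problems.append(f"{sample}: '{name}' not among uploaded files")
        layout = (row.get("library_layout") or "").strip().lower()
        if layout == "paired" and len(listed) < 2:
            problems.append(
                f"{sample}: paired layout but only {len(listed)} file(s) listed")
    return problems


# Hypothetical sheet: gut_A repeats the mistake described above, gut_B is fine.
metadata = (
    "sample_name\tlibrary_layout\tfilename\tfilename2\n"
    "gut_A\tpaired\tgut_A_R1\t\n"
    "gut_B\tpaired\tgut_B_R1.fastq.gz\tgut_B_R2.fastq.gz\n"
)
uploaded_files = ["gut_A_R1.fastq.gz", "gut_B_R1.fastq.gz", "gut_B_R2.fastq.gz"]

problems = check_filenames(metadata, uploaded_files)
for problem in problems:
    print(problem)
```

In practice uploaded_files could come from os.listdir() on the folder holding the fastq files, so the sheet is compared against what is really on disk.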
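New bug: Error: You must be the owner of the BioSample XXX or in a group with the owner of this BioSample to use it in this submission. If the BioSample belongs to a different group, you will need to create a new BioSample.
Not sure exactly where the problem lies; it is probably the attribute sheet. Next attempt: delete the empty columns from the attribute sheet and try again.
After many rounds of back-and-forth between the metadata and the files, the submission finally went through, though SRA still shows it as processing. Ganggang suspects there will be more to fix later.
Finally, NCBI's site really is slow, and unstable too; just uploading the sheets by trial and error can eat up two days. Back when Ganggang was overseas, BLAST was already a source of suffering. But what can you do when they are both excellent and a monopoly. Mendeley has launched a database as well, but it is not widely accepted. Nothing for it; Ganggang just has to keep slugging it out.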
------------------------------ Update divider ------------------------------
Have a look if you need it~
