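[The Era of AI Protein Structure Prediction] A Tutorial on Using AlphaFold2 on Colab
AlphaFold2 is now open source, which should further advance protein structure prediction and protein design in the research community.
As the result figures on the official website show, the predicted structures closely match those determined by structural biology experiments.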
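----------- The newest, lowest-effort method -----------
Just click through, enter your sequence, and run all cells to get a prediction.
----------- Everything below this divider is optional (older method) -----------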
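However, the model data and the resources needed for prediction are very large, and a full run takes considerable time. Here we use the “” code provided by Sergey Ovchinnikov to get a quick run working and see how it performs. First, install and import the required libraries.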
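%%bash
# Clone the AlphaFold repository, then move the inner Python package
# to the working directory so that "import alphafold" works.
git clone https://github.com/deepmind/alphafold.git
mv alphafold alphafold_
mv alphafold_/alphafold .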
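%%bash
# Download the model parameters, unpack them, and collect the
# params_* files in a params/ directory.
wget -qnc https://storage.googleapis.com/alphafold/alphafold_params_2021-07-14.tar
tar -xf alphafold_params_2021-07-14.tar
rm alphafold_params_2021-07-14.tar
mkdir params
mv params_* params/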
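After the download step above, the working directory should contain a params/ folder holding the extracted weight files. A minimal sketch for confirming that, assuming the archive unpacks to files named params_*.npz; the helper name list_param_files is hypothetical and not part of AlphaFold:

```python
from pathlib import Path
from typing import List


def list_param_files(params_dir: str = "params") -> List[str]:
    """Return the sorted names of params_*.npz files in params_dir.

    Hypothetical helper for illustration; returns an empty list if the
    directory does not exist.
    """
    directory = Path(params_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("params_*.npz"))


if __name__ == "__main__":
    found = list_param_files("params")
    if "params_model_1.npz" in found:
        print("model_1 weights found")
    else:
        print("model_1 weights missing; found:", found)
```

Running this check before loading the model can save time, since a missing weight file otherwise only surfaces when the parameters are read.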
%%bash
pip -q install biopython
pip -q install dm-haiku
pip -q install ml-collections
pip -q install mock
pip -q install py3Dmol
from typing import Dict
import os
import mock
import numpy as np
import pickle
import py3Dmol
from alphafold.common import protein
from alphafold.data import pipeline
from alphafold.data import templates
from alphafold.model import data
from alphafold.model import config
from alphafold.model import model
Define a structure prediction function.
def predict_structure(
    fasta_path: str,
    fasta_name: str,
    output_dir_base: str,
    data_pipeline: pipeline.DataPipeline,
    model_runners: Dict[str, model.RunModel],
    random_seed: int):
    """Predicts structure using AlphaFold for the given sequence."""
    output_dir = os.path.join(output_dir_base, fasta_name)
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    msa_output_dir = os.path.join(output_dir, 'msas')
    if not os.path.exists(msa_output_dir):
        os.makedirs(msa_output_dir)
    # Get features.
    feature_dict = data_pipeline.process(
        input_fasta_path=fasta_path, msa_output_dir=msa_output_dir)
    # Run the models.
    for model_name, model_runner in model_runners.items():
        processed_feature_dict = model_runner.process_features(
            feature_dict, random_seed=random_seed)
        prediction_result = model_runner.predict(processed_feature_dict)
        unrelaxed_protein = protein.from_prediction(
            processed_feature_dict, prediction_result)
        # Write one unrelaxed PDB file per model.
        unrelaxed_pdb_path = os.path.join(
            output_dir, f'unrelaxed_{model_name}.pdb')
        with open(unrelaxed_pdb_path, 'w') as f:
            f.write(protein.to_pdb(unrelaxed_protein))
Change query_sequence to your own sequence.
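Before pasting in your own sequence, it can help to check that it contains only the 20 standard one-letter amino acid codes. A minimal sketch follows; the helper name clean_sequence and its error messages are hypothetical, and this check says nothing about how the model itself treats other letters:

```python
# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Strip whitespace, upper-case, and reject non-standard letters."""
    sequence = "".join(raw.split()).upper()
    if not sequence:
        raise ValueError("sequence is empty")
    unexpected = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"unexpected characters: {unexpected}")
    return sequence


if __name__ == "__main__":
    example = ("GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGG"
               "TERLKRFLEELRQKLEKKGYTVDIKIE")
    print("length:", len(clean_sequence(example)))
```

The cleaned string can then be assigned to query_sequence in the cell below.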
# CHANGE THIS LINE TO YOUR FAVE SEQUENCE! :D
query_sequence = "GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE"
# fake template: all-zero coordinates and masks, gap characters as sequence
output_templates_sequence = []
output_confidence_scores = []
templates_all_atom_positions = []
templates_all_atom_masks = []
for _ in query_sequence:
    templates_all_atom_positions.append(
        np.zeros((templates.residue_constants.atom_type_num, 3)))
    templates_all_atom_masks.append(
        np.zeros(templates.residue_constants.atom_type_num))
    output_templates_sequence.append('-')
    output_confidence_scores.append(-1)
output_templates_sequence = ''.join(output_templates_sequence)
templates_aatype = templates.residue_constants.sequence_to_onehot(
    output_templates_sequence,
    templates.residue_constants.HHBLITS_AA_TO_ID)
# [None] adds a leading axis for the (single) template.
template_features = {
    'template_all_atom_positions': np.array(templates_all_atom_positions)[None],
    'template_all_atom_masks': np.array(templates_all_atom_masks)[None],
    'template_sequence': ['none'.encode()],
    'template_aatype': np.array(templates_aatype)[None],
    'template_confidence_scores': np.array(output_confidence_scores)[None],
    'template_domain_names': ['none'.encode()],
    'template_release_date': ['none'.encode()]}
# fake pipeline for testing: process() returns single-sequence features
# instead of running the MSA and template searches
data_pipeline_mock = mock.Mock()
data_pipeline_mock.process.return_value = {
    **pipeline.make_sequence_features(sequence=query_sequence,
                                      description="none",
                                      num_res=len(query_sequence)),
    **pipeline.make_msa_features(msas=[[query_sequence]],
                                 deletion_matrices=[[[0]*len(query_sequence)]]),
    **template_features
}
fasta_path = os.path.join('target.fasta')
with open(fasta_path, 'wt') as f:
    f.write(f">A\n{query_sequence}")
fasta_name = 'none'
out_dir = "."
# load model_1
model_runners = {}
model_name = "model_1"
model_config = config.model_config(model_name)
model_config.data.eval.num_ensemble = 1
# data_dir="." because the weights were moved into ./params above.
model_params = data.get_model_haiku_params(
    model_name=model_name, data_dir=".")
model_runner = model.RunModel(model_config, model_params)
model_runners[model_name] = model_runner
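The fake pipeline above is a mock.Mock object whose process method is set up to hand back a prepared feature dictionary, so predict_structure can call data_pipeline.process(...) without any database search taking place. A minimal standalone sketch of that mechanism, using the standard library's unittest.mock and a placeholder dictionary (the keys and values here are illustrative, not real AlphaFold features):

```python
from unittest import mock

# Placeholder features; in the tutorial the real dictionary is built from
# the sequence, MSA, and template features.
fake_features = {"sequence": "GWSTELEK", "num_res": 8}

fake_pipeline = mock.Mock()
fake_pipeline.process.return_value = fake_features

# Any arguments are accepted and the preset dictionary is returned.
result = fake_pipeline.process(
    input_fasta_path="target.fasta", msa_output_dir="none/msas")

assert result is fake_features
# The mock also records how it was called, which helps when debugging.
fake_pipeline.process.assert_called_once_with(
    input_fasta_path="target.fasta", msa_output_dir="none/msas")
```

One consequence of this design is that the FASTA file written to disk is never read by the mock; the features come entirely from the dictionary assigned to return_value.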
Predict the structure. This example protein has 68 amino acids, and the run takes about 2 minutes.
predict_structure(
    fasta_path=fasta_path,
    fasta_name=fasta_name,
    output_dir_base=".",
    data_pipeline=data_pipeline_mock,
    model_runners=model_runners,
    random_seed=0)
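Once the call above finishes, a quick sanity check is to count the residues in the written PDB file and compare that with the length of the query sequence. A minimal sketch, assuming the standard fixed-column PDB layout for ATOM records; the helper name count_residues is hypothetical:

```python
import os
from typing import Iterable


def count_residues(pdb_lines: Iterable[str]) -> int:
    """Count residues by counting C-alpha ATOM records.

    Relies on the fixed-column PDB layout: record name in columns 1-6
    and atom name in columns 13-16.
    """
    count = 0
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            count += 1
    return count


if __name__ == "__main__":
    pdb_path = "none/unrelaxed_model_1.pdb"  # output path used in this tutorial
    if os.path.exists(pdb_path):
        with open(pdb_path) as handle:
            print("residues in prediction:", count_residues(handle))
    else:
        print("no prediction found at", pdb_path)
```

For the example sequence, the count should equal len(query_sequence) if the prediction was written completely.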
View the structure. The protein file in PDB format can also be downloaded from the file panel.
# Cartoon view, colored as a spectrum along the chain.
p = py3Dmol.view(js='https://3dmol.org/build/3Dmol.js')
p.addModel(open("none/unrelaxed_model_1.pdb", 'r').read(), 'pdb')
p.setStyle({'cartoon': {'color': 'spectrum'}})
p.show()
# Same view with stick representation added.
p = py3Dmol.view(js='https://3dmol.org/build/3Dmol.js')
p.addModel(open("none/unrelaxed_model_1.pdb", 'r').read(), 'pdb')
p.setStyle({'cartoon': {'color': 'spectrum'}, 'stick': {}})
p.show()
