how to identify CDR after numbering sequence #39

Open
renyongzhe opened this issue Jun 21, 2022 · 4 comments
Comments

@renyongzhe

Hi all,
Because SCALOP will miss H-CDR3, is there another tool that can identify the CDR regions? After numbering the antibody sequence, is it correct to extract the CDR regions according to the positions given in the CDR Definitions? Any other suggestions would be appreciated.

@gciaberi

Once you have the numbered antibody sequence, you can simply extract the residues whose number falls within the CDR3 limits given in the definition table you linked to, with something like:
[residue for residue, resi in numbered_residues if CDR3_start_index <= resi <= CDR3_end_index].

You just have to be careful to correctly catch residues that have a letter appended to their residue number. Suppose you have an IMGT numbered heavy chain sequence with the following format:
[... (T, 93), (S, 94), (D, 94A), (H, 94B), (K, 95), ...]

Then you have to make sure you correctly extract the D and H and get TSDHK...

Other than that, I don't think there are other edge cases to deal with. Maybe just check that the CDR3 sequence you have extracted has a minimum length of 5 or something like that.
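
The letter-suffix handling described above can be sketched in Python as follows. This is only an illustration, not code from SCALOP or any numbering tool: the helper names `split_position` and `extract_region` are hypothetical, positions are assumed to be either integers or strings such as "94A", and the limits 93 to 95 are chosen only to cover the fragment shown, not taken from a real CDR3 definition.

```python
import re

def split_position(position):
    """Split a label such as '94A' into (94, 'A'); 93 or '93' gives (93, '')."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", str(position))
    if match is None:
        raise ValueError(f"unrecognised position label: {position!r}")
    return int(match.group(1)), match.group(2)

def extract_region(numbered_residues, start, end):
    """Join residues whose numeric position lies in [start, end].

    Only the numeric part of the label is compared, so 94A and 94B are
    kept together with 94. Gap characters ('-') are skipped.
    """
    return "".join(
        residue
        for residue, position in numbered_residues
        if residue != "-" and start <= split_position(position)[0] <= end
    )

# The fragment from the comment above, in (residue, position) order.
numbered = [("T", 93), ("S", 94), ("D", "94A"), ("H", "94B"), ("K", 95)]
print(extract_region(numbered, 93, 95))  # TSDHK
```

Comparing only the integer part keeps the inserted residues next to their parent position, which is why D and H are not lost. A length check such as `len(cdr3) >= 5` could be applied to the returned string if desired.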

@renyongzhe

renyongzhe commented Jul 4, 2022

Thanks for your explanation. Here is another IMGT numbered heavy chain sequence, in the format below, where CDR1 is located at H26-H33. How do I get the correct CDR1 sequence? The output of SCALOP is GFSINGSW.
[(26, S), (27, G), (28, F), (29, S), (30, I), (31, -), (32, -), (33, -), (34, -), (35, N), (36, G), ...]

@gciaberi

gciaberi commented Jul 4, 2022

You should double-check the CDR definitions table you are using because it seems wrong: on the official IMGT website, CDR1 is defined as H27-38.
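
As a rough illustration of the difference, here is a small Python sketch that applies both ranges to the positions visible in the question. The function name `extract_by_range` is made up for this example, and the list stops at position 36 because the original is truncated there, so the result only covers the residues shown.

```python
# Visible part of the numbered sequence from the question, as (position, residue).
# The original list continues after position 36; those residues are not shown.
numbered = [
    (26, "S"), (27, "G"), (28, "F"), (29, "S"), (30, "I"),
    (31, "-"), (32, "-"), (33, "-"), (34, "-"), (35, "N"), (36, "G"),
]

IMGT_CDR1 = (27, 38)  # range stated in the comment above

def extract_by_range(numbered, start, end, gap="-"):
    """Join residues with start <= position <= end, dropping gap characters."""
    return "".join(
        residue
        for position, residue in numbered
        if start <= position <= end and residue != gap
    )

print(extract_by_range(numbered, *IMGT_CDR1))  # GFSING
print(extract_by_range(numbered, 26, 33))      # SGFSI
```

With the 27-38 range and gaps removed, the visible residues give GFSING, which matches the start of the SCALOP output GFSINGSW. The 26-33 range gives SGFSI instead.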

@renyongzhe

> You should double-check the CDR definitions table you are using because it seems wrong: on the official IMGT website, CDR1 is defined as H27-38.

You are right. But that leaves me confused about this definitions table, which many people and documents cite.
