| Products/Services Used | Details |
|---|---|
| Gene Synthesis | The maps are provided in data S2. The coding sequences of the 46 text-mined CEs were synthesized (GenScript, WuXi Biologics, and Atantares), and the protein sequences are listed in table S4. |

Mining and expanding high-quality genetic parts for synthetic biology and bioengineering are urgent needs in the research and development of next-generation biotechnology. However, gene mining has relied on sequence homology or ample expert knowledge, which fundamentally limits the establishment of a comprehensive genetic part catalog. In this work, we propose SYMPLEX (synthetic biological part mining platform by large language model-enabled knowledge extraction), a universal gene-mining platform based on large language models. We applied SYMPLEX to mine enzymes responsible for messenger RNA (mRNA) capping, a key process in eukaryotic posttranscriptional modification, and obtained thousands of diverse candidate...