HTA - An open-source software for assigning heads and tails to SMILES in polymerization reactions

Polymers are versatile materials with a wide range of applications. As polymer properties are improved, the way the repeating units are connected (head-to-tail, head-to-head, tail-to-tail) becomes increasingly important, since it directly influences the morphology and chain topology of the polymer and, consequently, its properties. Artificial intelligence (AI) based approaches are beginning to impact several domains of human life, science and technology. Polymer informatics is one such domain, where AI and machine learning (ML) tools are being used for the efficient development, design and discovery of polymers. A key enabling factor for polymer informatics is a machine-readable polymer representation. Polymers have been represented in a string format in which special characters tag the head and tail positions, indicating where the linking bond forms between repeat units. Existing tools for assigning the head and tail positions are limited in their applicability. In this work we present a new tool that assigns the head and tail atoms for a given monomer. On a database of 206 polymer precursors curated from the literature, our algorithm correctly predicted the class of 201 data points (97.6% accuracy) and correctly assigned the head and tail positions for 188 data points (91.3% accuracy).
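The two accuracy figures quoted in the abstract follow directly from the reported counts. The short Python sketch below reproduces that arithmetic only; the function name `accuracy_percent` and the variable names are illustrative assumptions and are not part of the HTA software.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return correct/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 1)


# Counts taken from the abstract; the names are hypothetical.
TOTAL_PRECURSORS = 206
class_accuracy = accuracy_percent(201, TOTAL_PRECURSORS)      # class prediction
head_tail_accuracy = accuracy_percent(188, TOTAL_PRECURSORS)  # head/tail assignment

print(f"class accuracy: {class_accuracy}%")          # 97.6%
print(f"head/tail accuracy: {head_tail_accuracy}%")  # 91.3%
```

Both values are fractions of the same 206 precursors.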

Identifier
Source https://archive.materialscloud.org/record/2025.6
Metadata Access https://archive.materialscloud.org/xml?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:materialscloud.org:1896
Provenance
Creator de Souza Ferrari, Brenda; Giro, Ronaldo; B. Steiner, Mathias
Publisher Materials Cloud
Publication Year 2025
Rights info:eu-repo/semantics/openAccess; Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccess true
Contact archive(at)materialscloud.org
Representation
Language English
Resource Type Dataset
Discipline Materials Science and Engineering