1 |
Graphon: A Comparison of Grapheme-to-phoneme Conversion Performance between an Automated System and Primary Grade Students
Joubarne, Colette January 2015
Grapheme-to-phoneme conversion is a necessary part of reading, whether by an automated system or by children. Automated methods play a key role in text-to-speech and automated speech recognition systems. Children learning to read develop grapheme-to-phoneme (G2P) conversion rules that they use extensively until they build up their orthographic lexicon.
Various solutions have been proposed for G2P conversion, each addressing specific problems and evaluated for different languages. In this thesis, I introduce a simple approach to G2P conversion that achieves good results, and compare these results to those of a study of children’s reading accuracy in the primary grades. The comparison highlights areas of weakness in the children’s reading skills, as well as particular phonemes for which the G2P system has difficulty. As part of the process, I also compare and discuss the wide range of discrepancies that exist between various French corpora.
|
2 |
Grapheme-to-phoneme transcription of English words in Icelandic text
Ármannsson, Bjarki January 2021
Foreign words, such as names, locations or sometimes entire phrases, are a problem for any system that is meant to convert graphemes to phonemes (g2p, i.e. converting written text into phonetic transcription). In this thesis, we investigate both rule-based and neural methods of phonetically transcribing English words found in Icelandic text, taking into account the rules and constraints of how foreign phonemes can be mapped into Icelandic phonology. We implement a rule-based system by compiling grammars into finite-state transducers. In deciding which rules to include, and in evaluating their coverage, we use a list of the most frequently found English words in a corpus of Icelandic text. The output of the rule-based system is then manually evaluated and corrected (when needed) and subsequently used as data to train a simple bidirectional LSTM g2p model. We train models both with and without length and stress labels included in the gold annotated data. Although the scores for neither model are close to the state of the art for either Icelandic or English, both our rule-based system and LSTM model show promising initial results and improve on the baseline of simply using an Icelandic g2p model, rule-based or neural, on English words. We find that the greater flexibility of the LSTM model seems to give it an advantage over our rule-based system when it comes to modeling certain phenomena. Most notable is the LSTM’s ability to more accurately transcribe relations between graphemes and phonemes for English vowel sounds. Since little previous work exists on g2p transcription that specifically handles English words within Icelandic phonological constraints, and the task remains unsolved, our findings provide a foundation for further research and contribute to improving g2p systems for Icelandic as a whole.
|
3 |
Phonemic variability and confusability in pronunciation modeling for automatic speech recognition
Karanasou, Panagiota 11 June 2013
This thesis addresses the problems of phonemic variability and confusability from the pronunciation modeling perspective for an automatic speech recognition (ASR) system. In particular, several research directions are investigated. First, automatic grapheme-to-phoneme (g2p) and phoneme-to-phoneme (p2p) converters are developed that generate alternative pronunciations for in-vocabulary as well as out-of-vocabulary (OOV) terms. Since the addition of alternative pronunciations may introduce homophones (or close homophones), the confusability of the system increases. A novel measure of this confusability is proposed to analyze it and study its relation to ASR performance. This pronunciation confusability is higher if pronunciation probabilities are not provided and can potentially severely degrade ASR performance. It should, thus, be taken into account during pronunciation generation. Discriminative training approaches are then investigated to train the weights of a phoneme confusion model that allows alternative ways of pronouncing a term while counterbalancing the phonemic confusability problem. The objective function to optimize is chosen to correspond to the performance measure of the particular task. In this thesis, two tasks are investigated: the ASR task and the Keyword Spotting (KWS) task. For ASR, an objective that minimizes the phoneme error rate is adopted. For experiments conducted on KWS, the Figure of Merit (FOM), a KWS performance measure, is directly maximized.
|
4 |
Italianising English words with G2P techniques in TTS voices. An evaluation of different models
Grassini, Francesco January 2024
Text-to-speech voices have come a long way in terms of their naturalness, and they are closer to sounding human than ever. However, the pronunciation of foreign words remains one of the persistent problems. The experiments conducted in this thesis focus on using grapheme-to-phoneme (G2P) models to tackle this issue and, more specifically, to adjust the erroneous pronunciation of English words to an Italian English accent in Italian-speaking voices. We curated a dataset of words collected during recording sessions with an Italian voice actor reading general conversational sentences. We then manually transcribed their pronunciation in Italian English. In the second stage, we augmented the dataset by collecting the most common surnames in Great Britain and the United States, phonetically transcribed them with a rule-based phoneme mapping algorithm previously deployed by the company, and then manually adjusted the pronunciations to Italian English. Thirdly, using the massively multilingual ByT5 model, a Transformer G2P model pre-trained on 100 languages, as well as its tokenizer-dependent versions T5_base and T5_small, and an LSTM with attention based on OpenNMT, we performed 10-fold cross-validation with the curated dataset. The results show that augmenting the data benefitted every model. In terms of phoneme error rate (PER), word error rate (WER) and accuracy, the transformer-based ByT5_small strongly outperformed its T5_small and T5_base counterparts even with a third or two-thirds of the training data. The second-best model, the LSTM with attention built with the OpenNMT framework, also outperformed the T5 models, achieved the second-best accuracy in our experiments and was the 'lightest' in terms of trainable parameters (2M), compared with ByT5 (299M) and the T5 models (60M and 200M).
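The abstract reports PER, WER and accuracy without defining them. A minimal sketch of how these metrics are commonly computed for G2P output follows; it assumes PER is the total phoneme-level edit distance divided by the number of reference phonemes, and WER is the fraction of words with at least one phoneme error. The function names and the toy transcriptions are illustrative, not taken from the thesis.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps) -> float:
    """Total edit distance divided by total number of reference phonemes."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / sum(len(r) for r in refs)


def word_error_rate(refs, hyps) -> float:
    """Fraction of words whose predicted transcription is not an exact match."""
    return sum(list(r) != list(h) for r, h in zip(refs, hyps)) / len(refs)


# Toy data (hypothetical transcriptions, not from the thesis dataset).
refs = [["k", "a", "t"], ["d", "o", "g"]]
hyps = [["k", "a", "t"], ["d", "o", "k"]]
per = phoneme_error_rate(refs, hyps)
wer = word_error_rate(refs, hyps)
accuracy = 1.0 - wer
```

Under these assumptions, word-level accuracy is simply the complement of WER, so a model can have a low PER while still getting many whole words wrong.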
|
5 |
Data-driven augmentation of pronunciation dictionaries
Loots, Linsen 03 1900
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: This thesis investigates various data-driven techniques by which pronunciation dictionaries
can be automatically augmented. First, well-established grapheme-to-phoneme (G2P) conversion
techniques are evaluated for Standard South African English (SSAE), British English
(RP) and American English (GenAm) by means of four appropriate dictionaries: SAEDICT,
BEEP, CMUDICT and PRONLEX.
Next, the decision tree algorithm is extended to allow the conversion of pronunciations
between different accents by means of phoneme-to-phoneme (P2P) and
grapheme-and-phoneme-to-phoneme (GP2P) conversion. P2P conversion uses the phonemes of the source
accent as input to the decision trees. GP2P conversion further incorporates the graphemes
into the decision tree input. Both P2P and GP2P conversion are evaluated using the four
dictionaries. It is found that, when the pronunciation is needed for a word not present
in the target accent, it is substantially more accurate to modify an existing pronunciation
from a different accent, than to derive it from the word’s spelling using G2P conversion.
When converting between accents, GP2P conversion provides a significant further increase
in performance above P2P.
Finally, experiments are performed to determine how large a training dictionary is required
in a target accent for G2P, P2P and GP2P conversion. It is found that GP2P
conversion requires less training data than P2P and substantially less than G2P conversion.
Furthermore, it is found that very little training data is needed for GP2P to perform at almost
maximum accuracy. The bulk of the accuracy is achieved within the initial 500 words,
and after 3000 words there is almost no further improvement.
Some specific approaches to compiling the best training set are also considered. By means
of an iterative greedy algorithm an optimal ranking of words to be included in the training
set is discovered. Using this set is shown to lead to substantially better GP2P performance
for the same training set size in comparison with alternative approaches such as the use of
phonetically rich words or random selections. A mere 25 words of training data from this
optimal set already achieve an accuracy within 1% of that of the full training dictionary. / AFRIKAANSE OPSOMMING: Hierdie tesis ondersoek verskeie data-gedrewe tegnieke waarmee uitspraakwoordeboeke outomaties
aangevul kan word. Eerstens word gevestigde grafeem-na-foneem (G2P) omskakelingstegnieke
geëvalueer vir Standaard Suid-Afrikaanse Engels (SSAE), Britse Engels (RP)
en Amerikaanse Engels (GenAm) deur middel van vier geskikte woordeboeke: SAEDICT,
BEEP, CMUDICT en PRONLEX.
Voorts word die beslissingsboomalgoritme uitgebrei om die omskakeling van uitsprake
tussen verskillende aksente moontlik te maak, deur middel van foneem-na-foneem (P2P) en
grafeem-en-foneem-na-foneem (GP2P) omskakeling. P2P omskakeling gebruik die foneme
van die bronaksent as inset vir die beslissingsbome. GP2P omskakeling inkorporeer verder
die grafeme by die inset. Beide P2P en GP2P omskakeling word evalueer deur middel van
die vier woordeboeke. Daar word bevind dat wanneer die uitspraak benodig word vir ’n
woord wat nie in die teikenaksent teenwoordig is nie, dit bepaald meer akkuraat is om ’n
bestaande uitspraak van ’n ander aksent aan te pas, as om dit af te lei vanuit die woord se
spelling met G2P omskakeling. Wanneer daar tussen aksente omgeskakel word, gee GP2P
omskakeling ’n verdere beduidende verbetering in akkuraatheid bo P2P.
Laastens word eksperimente uitgevoer om die grootte te bepaal van die afrigtingswoordeboek
wat benodig word in ’n teikenaksent vir G2P, P2P en GP2P omskakeling. Daar
word bevind dat GP2P omskakeling minder afrigtingsdata as P2P en substansieel minder as
G2P benodig. Verder word dit bevind dat baie min afrigtingsdata benodig word vir GP2P
om teen bykans maksimum akkuraatheid te funksioneer. Die oorwig van die akkuraatheid
word binne die eerste 500 woorde bereik, en ná 3000 woorde is daar amper geen verdere
verbetering nie.
’n Aantal spesifieke benaderings word ook oorweeg om die beste afrigtingstel saam te stel.
Deur middel van ’n iteratiewe, gulsige algoritme word ’n optimale rangskikking van woorde
bepaal vir insluiting by die afrigtingstel. Daar word getoon dat deur hierdie stel te gebruik,
substansieel beter GP2P gedrag verkry word vir dieselfde grootte afrigtingstel in vergelyking
met alternatiewe benaderings soos die gebruik van foneties-ryke woorde of lukrake seleksies.
’n Skamele 25 woorde uit hierdie optimale stel gee reeds ’n akkuraatheid binne 1% van dié
van die volle afrigtingswoordeboek.
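The iterative greedy ranking of training words mentioned in this abstract can be illustrated with a generic forward-selection sketch. The abstract does not give the algorithm's details, so this is an assumption-laden outline: `greedy_ranking` and `grapheme_coverage` are hypothetical names, and the scoring function here is a toy stand-in (number of distinct graphemes covered) for what would presumably be GP2P accuracy after training on the selected words.

```python
from typing import Callable, List, Sequence


def greedy_ranking(candidates: Sequence[str],
                   score: Callable[[List[str]], float],
                   limit: int) -> List[str]:
    """Rank words by repeatedly adding the one that most improves the score."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < limit:
        # Evaluate each remaining word together with the current selection.
        best = max(remaining, key=lambda w: score(selected + [w]))
        selected.append(best)
        remaining.remove(best)
    return selected


def grapheme_coverage(words: List[str]) -> float:
    """Toy score: number of distinct graphemes covered by the word set."""
    return len(set("".join(words)))


ranking = greedy_ranking(["aaa", "abc", "cde", "ab"], grapheme_coverage, 3)
```

Each step re-scores every remaining candidate, so the cost grows with the square of the candidate count times the cost of one evaluation; with a real model-training score this would be the dominant expense.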
|
6 |
Phonemic variability and confusability in pronunciation modeling for automatic speech recognition / Variabilité et confusabilité phonémique pour les modèles de prononciations au sein d’un système de reconnaissance automatique de la parole
Karanasou, Panagiota 11 June 2013
Cette thèse aborde les problèmes de variabilité et confusabilité phonémique du point de vue des modèles de prononciation pour un système de reconnaissance automatique de la parole. En particulier, plusieurs directions de recherche sont étudiées. Premièrement, on développe des méthodes de conversion automatique de graphème-phonème et de phonème-phonème. Ces méthodes engendrent des variantes de prononciation pour les mots du vocabulaire, ainsi que des prononciations et des variantes de prononciation, pour des mots hors-vocabulaire. Cependant, ajouter plusieurs prononciations par mot au vocabulaire peut introduire des homophones (ou quasi-homophones) et provoquer une augmentation de la confusabilité du système. Une nouvelle mesure de cette confusabilité est proposée pour analyser et étudier sa relation avec la performance d’un système de reconnaissance de la parole. Cette “confusabilité de prononciation” est plus élevée si des probabilités pour les prononciations ne sont pas fournies et elle peut potentiellement dégrader sérieusement la performance d’un système de reconnaissance de la parole. Il convient, par conséquent, qu’elle soit prise en compte lors de la génération de prononciations. On étudie donc des approches d’entraînement discriminant pour entraîner les poids d’un modèle de confusion phonémique qui autorise différentes façons de prononcer un mot tout en contrôlant le problème de confusabilité phonémique. La fonction objectif à optimiser est choisie afin de correspondre à la mesure de performance de chaque tâche particulière. Dans cette thèse, deux tâches sont étudiées: la tâche de reconnaissance automatique de la parole et la tâche de détection de mots-clés. Pour la reconnaissance automatique de la parole, une fonction objectif qui minimise le taux d’erreur au niveau des phonèmes est adoptée. Pour les expériences menées sur la détection de mots-clés, le “Figure of Merit” (FOM), une mesure de performance de la détection de mots-clés, est directement optimisée. / This thesis addresses the problems of phonemic variability and confusability from the pronunciation modeling perspective for an automatic speech recognition (ASR) system. In particular, several research directions are investigated. First, automatic grapheme-to-phoneme (g2p) and phoneme-to-phoneme (p2p) converters are developed that generate alternative pronunciations for in-vocabulary as well as out-of-vocabulary (OOV) terms. Since the addition of alternative pronunciations may introduce homophones (or close homophones), the confusability of the system increases. A novel measure of this confusability is proposed to analyze it and study its relation to ASR performance. This pronunciation confusability is higher if pronunciation probabilities are not provided and can potentially severely degrade ASR performance. It should, thus, be taken into account during pronunciation generation. Discriminative training approaches are then investigated to train the weights of a phoneme confusion model that allows alternative ways of pronouncing a term while counterbalancing the phonemic confusability problem. The objective function to optimize is chosen to correspond to the performance measure of the particular task. In this thesis, two tasks are investigated: the ASR task and the Keyword Spotting (KWS) task. For ASR, an objective that minimizes the phoneme error rate is adopted. For experiments conducted on KWS, the Figure of Merit (FOM), a KWS performance measure, is directly maximized.
|
7 |
Towards a Language Model for Stenography: A Proof of Concept
Langstraat, Naomi Johanna January 2022
The availability of the stenographic manuscripts of Astrid Lindgren has sparked an interest in the creation of a language model for stenography. By its very nature, stenography is low-resource, and the unavailability of data requires a tool for using normal data. The tool presented in this thesis creates stenographic data by manipulating orthographic data. Stenographic data is distinguished from orthographic data by three types of manipulation. Firstly, stenography is based on a phonetic version of language; secondly, it uses its own alphabet, distinct from normal orthography; and thirdly, it uses several techniques to compress the data. The first type of manipulation is done using a grapheme-to-phoneme converter. The second is done using an orthographic representation of a stenographic alphabet. The third is done by manipulating the data at the subword, word and phrase levels. Different datasets are created with different combinations of these manipulations. Results are measured for both perplexity on a GPT-2 language model and compression rate on the different datasets. These results show a general decrease in perplexity scores and a slight compression rate across the board. The lower perplexity scores are possibly due to the growth of ambiguity.
|
8 |
Towards a unified model for speech and language processing
Ploujnikov, Artem 12 1900
Ce travail de recherche explore les méthodes d’apprentissage profond de la parole et du
langage, y inclus la reconnaissance et la synthèse de la parole, la conversion des graphèmes en
phonèmes et vice-versa, les modèles génératifs, visant de reformuler des tâches spécifiques dans
un problème plus général de trouver une représentation universelle d’information contenue
dans chaque modalité et de transférer un signal d’une modalité à une autre en se servant de
telles représentations universelles et à générer des représentations dans plusieurs modalités.
Il est compris de deux projets de recherche: 1) SoundChoice, un modèle graphème-phonème
tenant compte du contexte au niveau de la phrase qui réalise de bonnes performances et
des améliorations remarquables comparativement à un modèle de base et 2) MAdmixture, une
nouvelle approche pour apprendre des représentations multimodales dans un espace latent
commun. / The present work explores the use of deep learning methods applied to a variety of areas
in speech and language processing including speech recognition, grapheme-to-phoneme conversion,
speech synthesis, generative models for speech and others to build toward a unified
approach that reframes these individual tasks into a more general problem of finding a
universal representation of information encoded in different modalities and being able to
seamlessly transfer a signal from one modality to another by converting it to this universal
representation and to generate samples in multiple modalities. It consists of two main
research projects: 1) SoundChoice, a context-aware sentence-level Grapheme-to-Phoneme
model achieving solid performance on the task and a significant improvement on phoneme
disambiguation over baseline models and 2) MAdmixture, a novel approach to learning a variety
of speech representations in a common latent space.
|
9 |
Linking Genetic Resources, Genomes and Phenotypes of Solanaceus Crops
Alonso Martín, David 30 November 2024
[ES] El impacto del cambio climático en los cultivos hortícolas es cada vez más evidente, lo que ha llevado a la pérdida y erosión de diversidad genética de manera drástica. Esto plantea importantes desafíos para la mejora de los cultivos, que requiere la exploración de los recursos fitogenéticos conservados en los bancos de germoplasma y el desarrollo de tecnologías que permitan evaluar el valor fenotípico y genotípico de estos materiales. Sin embargo, la situación actual de las colecciones de germoplasma es la existencia de duplicados no identificados entre colecciones, errores en la clasificación taxonómica, documentación insuficiente y no disponible para investigadores y mejoradores, añadido a la falta de financiación para la conservación y gestión adecuadas. Esto dificulta enormemente la utilización de estos recursos. En la presente Tesis se aborda este problema comenzando por la unificación de datos de pasaporte, fenotipado e imágenes de las principales colecciones de tomate, pimiento y berenjena en un mismo repositorio en el primer capítulo.
El segundo capítulo se centra en el desarrollo y optimización de un método de extracción de ADN genómico de alta calidad, rápido y económico que combina las ventajas del método de extracción basado en el CTAB, añadido a la purificación de los ácidos nucleicos en una matriz de sílice. Es un método universal que puede utilizarse para diferentes especies y tejidos. Se ha evaluado la eficiencia del ADN genómico resultante en diferentes plataformas de secuenciación como SPET (Single Primer Enrichment Technology) y Oxford Nanopore, generando resultados muy prometedores. Esto facilita el paso previo al genotipado de las colecciones que es la extracción de ADN.
En el tercer capítulo se aborda el genotipado de las colecciones. El elevado número de accesiones de cada cultivo, en particular el tomate, supone un problema de tipo económico, en ocasiones irresoluble. Por ello, el tercer capítulo está orientado a la evaluación del potencial de la tecnología de secuenciación SPET, más económica que otras conocidas, para el genotipado de alto rendimiento de colecciones de germoplasma de tomate y berenjena. Los resultados revelan que el genotipado SPET es una tecnología robusta y de alto rendimiento para estudios genéticos, incluyendo la posibilidad de identificación de duplicados y errores de clasificación taxonómica en las entradas conservadas en los bancos. Con la información generada en los primeros tres capítulos se establecieron las colecciones nucleares para cada cultivo, abarcando la máxima diversidad genética y fenotípica en un conjunto de 450 individuos.
Finalmente, en el cuarto capítulo, se analiza y describe la colección nuclear de tomate a nivel genético y fenotípico, mediante un enfoque basado en el establecimiento de grupos genéticos basados en su proximidad genética. El análisis de la diversidad genética y fenotípica reveló patrones de variación distintos entre diferentes grupos genéticos, contradiciendo afirmaciones anteriores que proponían una disminución en la diversidad genética como consecuencia de la mejora genética y descubriendo correlaciones entre rasgos morfológicos únicas dentro de los diferentes grupos.
En resumen, esta tesis aumenta el conocimiento y accesibilidad a las colecciones de Solanaceae en bancos de germoplasma y proporciona herramientas moleculares. Destaca la importancia de estos bancos como reservorios de diversidad genética, aunque enfrenten desafíos como datos limitados y duplicados. Estos avances sientan las bases para la conservación y programas de mejora futuros. / [CA] L'impacte del canvi climàtic en els cultius hortícoles és cada vegada més evident, la qual cosa ha portat a la dràstica pèrdua i erosió de la diversitat genètica. La reduïda diversitat genètica planteja importants reptes per a la millora dels cultius. Sent necessari l'exploració dels recursos genètics vegetals conservats en els bancs de germoplasma i el desenvolupament de tecnologies que permeten avaluar el valor fenotípic i genotípic d'aquests materials. Pel que fa a les col·leccions de germoplasma presenten duplicats no identificats entre col·leccions, errors en la classificació taxonòmica, falta de finançament per a la conservació i gestió adequades a banda de documentació insuficient i no disponible (investigadors i milloradors vegetals). En la present tesi doctoral en el primer capítol s'aborda aquest problema unificant les dades de passaport, fenotipat i imatges de les principals col·leccions de tomaca, pebre i albergínia en un mateix repositori.
El segon capítol es focalitza en el desenvolupament i optimització d'un mètode d'extracció de ADN genòmic d'alta qualitat, ràpid i econòmic que combina els avantatges del mètode d'extracció basat en el CTAB amb l'ús de matrius de sílice. El mètode desenvolupat pot utilitzar-se de manera universal per a diferents espècies i teixits vegetals. S'ha avaluat l'eficiència del ADN genòmic resultant en diferents plataformes de seqüenciació com SPET (Single Primer Enrichment Technology) i Oxford Nanopore, generant resultats molt prometedors. Això facilita el pas previ al genotipat de les col·leccions que és l'extracció d'ADN.
En el tercer capítol aborda l'optimització del procés de genotipat de les col·leccions generades. L'elevat nombre d'accessions de cada cultiu, en particular la tomaca, suposa un problema de tipus econòmic, a vegades irresoluble. Per això, aquest capítol està orientat a l'avaluació del potencial de la tecnologia de seqüenciació SPET per al genotipat d'alt rendiment de col·leccions de germoplasma de tomaca i albergínia a un preu econòmic. Els resultats revelen que el genotipat SPET és una tecnologia robusta i d'alt rendiment per a estudis genètics, incloent-hi la possibilitat d'identificació de duplicats i errors de classificació taxonòmica en les entrades conservades en els bancs de germoplasma. La informació generada va permetre establir col·leccions nuclears per a cada cultiu, abastant la màxima diversitat genètica i fenotípica en un conjunt de 450 individus.
Finalment, en el quart capítol, s'analitza i descrigué la col·lecció nuclear de tomaca a nivell genètic i fenotípic, focalitzant-se en l'establiment de grups genètics basats en la seua proximitat genètica. L'anàlisi de la diversitat genètica i fenotípica va revelar patrons de variació diferents entre diferents grups genètics, contradient afirmacions anteriors que proposaven una disminució en la diversitat genètica a conseqüència de la millora genètica. També és descobriren noves correlacions entre trets morfològics únics dins dels diferents grups. L'estudi destaca la importància d'abordar les iniciatives de millora de la tomaca tenint en compte tant la diversitat genètica com la fenotípica, amb especial èmfasi en aspectes com la grandària, la forma, el color i la qualitat del fruit.
En definitiva, els treballs realitzats en aquesta tesi doctoral augmenten, d'una banda, el coneixement i l'accessibilitat a les principals col·leccions de solanàcies conservades en els bancs de germoplasma. Per un altre, generen eines moleculars que permeten l' avaluació genotípica de les col·leccions analizades. En resum, aquests avanços suposen una base per al futur, proporcionant informació valuosa per a la pròpia conservació de les col·leccions i el seu ús en programes de millora. / [EN] The impact of climate change on horticultural crops is increasingly evident, leading to drastic loss and erosion of genetic diversity. This poses significant challenges for crop improvement, which requires the exploration of plant genetic resources conserved in germplasm banks and the development of technologies. However, the current situation of germplasm collections is characterized by the existence of unidentified duplicates among collections, taxonomic mislabelling, insufficient and unavailable documentation for researchers and breeders, and a lack of funding for proper conservation and management. This greatly hampers the utilization of these resources. This thesis addresses this problem by starting with the unification of passport, phenotyping, and image data from the main collections of tomato, pepper, and eggplant. Genotyping these collections enables the creation of core collections, enhancing knowledge of genotypic and phenotypic variability for researchers and breeders.
In the first chapter, an inventory of available passport and phenotypic data of tomato, pepper, and eggplant accessions conserved in major European and non-European germplasm banks was conducted to improve the efficiency of plant genetic resource management.
The second chapter focuses on the development and optimization of a high-quality, fast, and cost-effective genomic DNA extraction method that combines the advantages of the CTAB-based extraction method with nucleic acid purification on a silica matrix. The efficiency of the resulting genomic DNA was evaluated on different sequencing platforms, such as Single Primer Enrichment Technology (SPET) and Oxford Nanopore, yielding promising results. This facilitates the prerequisite step of DNA extraction before genotyping the collections.
Chapter three addresses the genotyping of the collections. The high number of accessions for each crop, particularly tomato, poses an often insurmountable economic problem. Therefore, chapter three is focused on evaluating the potential of SPET sequencing technology, which is more cost-effective than other known methods, for high-throughput genotyping of tomato and eggplant germplasm collections. The results reveal that SPET genotyping is a robust and high-performance technology for genetic studies, including the identification of duplicates and taxonomic misclassifications in the accessions stored in the germplasm banks. Based on the information generated in the first three chapters, core collections were established for each crop, encompassing maximum genetic and phenotypic diversity in a set of 450 individuals.
Finally, in the fourth chapter, the tomato core collection is analysed and described at the genetic and phenotypic levels, using an approach that establishes genetic groups according to their genetic proximity. Genetic and phenotypic diversity analysis revealed distinct patterns of variation among different genetic groups, contradicting previous claims of a decrease in genetic diversity due to genetic improvement and uncovering unique correlations between morphological traits within different groups. The study highlights the importance of considering both genetic and phenotypic diversity in tomato breeding initiatives, with a particular emphasis on aspects such as fruit size, shape, color, and quality.
In conclusion, this thesis enhances knowledge and accessibility to major Solanaceae collections in germplasm banks, while providing molecular tools for genotypic evaluation. It underscores germplasm banks' role as genetic diversity reservoirs, despite challenges such as data limitations and inaccuracies, emphasizing the importance of data standardization and maintenance. These advancements lay a foundation for conservation and breeding programs in the future. / This work was supported by grants CIPROM/2021/020 from Conselleria d’Innovació, Universitats, Ciència i Societat Digital (Generalitat Valenciana, Spain), PID2021-128148OB-I00 funded by MCIN/AEI/10.13039/501100011033/ and by “ERDF A way of making Europe”, PDC2022-133513-I00 funded by MCIN/AEI/10.13039/501100011033/, and by “European Union NextGenerationEU/PRTR”, by Grant Agreement No. 677379 (G2P-SOL project: Linking genetic resources, genomes and phenotypes of Solanaceous crops) from European Union’s Horizon 2020 Research and Innovation Programme, by the Grant Agreement No. 101094738 (PRO-GRACE project: Promoting a Plant Genetic Resource Community for Europe) from the European Union’s Horizon Europe programme, as well as by the initiative "Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives", which is supported by the Government of Norway. This latter project is managed by the Global Crop Diversity Trust with the Millennium Seed Bank of the Royal Botanic Gardens, Kew and implemented in partnership with national and international gene banks and plant breeding institutes around the world. For further information, see the project website: http://www.cwrdiversity.org/. The overall work also partially fulfils some goals of the Agritech National Research Center and received funding from the European Union Next-Generation EU (PIANO NAZIONALE DI RIPRESA E RESILIENZA (PNRR)–MISSIONE 4 COMPONENTE 2, INVESTIMENTO 1.4—D.D. 1032 17/06/2022, CN00000022).
David Alonso is grateful to Universitat Politècnica de València for a predoctoral (PAID-01-16) / Alonso Martín, D. (2023). Linking Genetic Resources, Genomes and Phenotypes of Solanaceus Crops [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/201550
|