191 |
Integrating Sequence and Structure for Annotating Proteins in the Twilight Zone: A Machine Learning Approach / Isye Arieshanti, Unknown Date
Determining protein structure and function experimentally is both costly and time consuming. Transferring function-related protein annotations with homology-based methods is relatively straightforward for proteins that have a sequence identity of more than 40%. However, many proteins lie in the "twilight zone", where sequence similarity with any other protein is very weak even though the protein is structurally similar to several others. Such cases require methods that can exploit both sequence and structural similarity. This study focuses on how such methods can and should be designed. In this thesis, models that use both sequence and structure features are applied to two protein prediction problems that are particularly challenging when relying on sequence alone. Enzyme classification benefits from both kinds of features because, on the one hand, enzymes can have identical function with limited sequence similarity while, on the other hand, proteins with a similar fold may have disparate enzyme class annotations. This thesis shows that the full integration of protein sequence and structure-related features (via the use of kernels) automatically places proteins with similar biological properties closer together, leading to superior classification accuracy using Support Vector Machines. Disulfide bonds link residues in a protein structure that may appear distant in sequence. Sequence similarity reflecting such structural properties is thus very hard to detect. Structural similarity is sufficient for accurate prediction of disulfide bonds, but structural information is very scarce, so predictors that rely on protein structure are not nearly as useful as those operating on sequence alone. This thesis proposes a novel approach based on Kernel Canonical Correlation Analysis that uses structural features during training only.
It does so by finding sequence representations that correlate with structural features that are essential for a disulfide bond. The resulting representations enable high prediction accuracy for a range of disulfide-bond problems. The proposed model thus taps the advantage of structural features without requiring protein structure to be available in the prediction process. The merits of this approach should apply to a number of open protein structure prediction problems.
|
192 |
Informative correlation extraction from and for Forex market analysis / Lei, Song, January 2010
The forex market is a complex, evolving, non-linear dynamical system, and it is difficult to forecast because of high data intensity, noise and outliers, unstructured data and a high degree of uncertainty. However, the exchange rate of a currency is often found to be surprisingly similar to its own history or to the variation of another currency, which implies that correlation knowledge is valuable for forex market trend analysis. In this research, we propose a computational correlation analysis for the intelligent extraction of correlation from all available economic data. The proposed correlation is a synthesis of channel correlation and weighted Pearson's correlation, where the channel correlation traces the trend similarity of time series, and the weighted Pearson's correlation filters noise in correlation extraction. In the forex market analysis, we consider 3 particular aspects of correlation knowledge: (1) historical correlation, correlation to previous market data; (2) cross-currency correlation, correlation to relevant currencies; and (3) macro correlation, correlation to macroeconomic variables. To evaluate the validity of the extracted correlation knowledge, we compare Support Vector Regression (SVR) against correlation-aided SVR (cSVR) for forex time series prediction, where the correlation is used in addition to the observed forex time series data for the training of the SVR. The experiments are carried out on 5 futures contracts (NZD/AUD, NZD/EUR, NZD/GBP, NZD/JPY and NZD/USD) within the period from January 2007 to December 2008. The comparison results show that the proposed correlation is computationally significant for forex market analysis in that the cSVR performs consistently better than plain SVR on exchange rate prediction for all 5 contracts, in terms of the error functions MSE, RMSE, NMSE, MAE and MAPE.
However, the cSVR prediction is occasionally found to differ significantly from the actual price, which suggests that, despite the significance of the proposed correlation, how to use correlation knowledge for market trend analysis remains a major challenge that in practice limits further understanding of the forex market. In addition, the selection of macroeconomic factors and the determination of the time period for analysis are two computationally essential points worth addressing in future forex market correlation analysis.
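The five error functions named in the abstract can be illustrated with a minimal Python sketch. This is not the author's code: the function name `error_metrics` is hypothetical, and NMSE has several definitions in the literature, so the normalisation used here (MSE divided by the variance of the actual series) is an assumption.

```python
import math


def error_metrics(actual, predicted):
    """Return MSE, RMSE, NMSE, MAE and MAPE for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    variance = sum((a - mean_actual) ** 2 for a in actual) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # Assumed normalisation: MSE divided by the variance of the actual series.
        "NMSE": mse / variance if variance > 0 else float("nan"),
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE in percent; assumes no actual value is zero.
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the thesis.
    print(error_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

Computing the same dictionary for the SVR and the cSVR forecasts of one contract would give a like-for-like comparison of the kind the abstract describes.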
|
194 |
A Study of Southern Spectroscopic Binaries / Thompson, Vincent Brent, January 2009
The study of spectroscopic binaries is by no means a new area of study. The Doppler shifting of spectral lines as the stars orbit each other can now be measured very precisely. Binary stars give a reliable means of determining stellar parameters such as the mass. A star's mass is one of the most dominant factors in determining its evolution. Stars for study in this thesis were selected from SB9 (the ninth catalogue of spectroscopic binaries). They were chosen on criteria such as apparent visual magnitude, orbital period, orbital solution grade, equatorial velocity and position. Only stars with poor to average orbital solutions were chosen, as it is these orbits that need the most work. In total, 6 spectroscopic binary systems were chosen for study in this thesis: four single-lined spectroscopic binaries (HD 70958, HD 110318, HD 122223 and HD 141544) and two double-lined spectroscopic binaries (HD 110317 and HD 148704). Unfortunate observing conditions meant that adequate phase coverage of HD 110317 and HD 110318 was not achieved. Adequate phase coverage of the star HD 122223 was also not achieved, but this is likely a result of the period being about three years and not about 207 days as quoted in the catalogue. Observations were carried out with the HERCULES spectrograph and the 1-metre McLellan telescope at the Mt John University Observatory from December 2007 until September 2008. Radial velocities were then measured from these spectra with HRSP3, and orbital solutions were derived. Orbital solutions have been derived for the single-lined systems HD 141544 and HD 70958. The precision of the HD 141544 solution was much better than that of HD 70958. This is because HD 70958 is complicated by differential rotation and possible chromospheric activity. The orbital solution of the double-lined system HD 148704 was obtained by using CARTopt and not TODCOR as is common, with good results.
HD 122223 is included even though only six spectra were obtained, as it will be evident that the current orbital solution should be rejected in favour of the previous solution obtained in 1936 by Christie. Although the amount of data was not as large as was hoped, significant improvements of the orbital solutions were obtained. The secondary component of HD 148704 had previously been detected in only a very few spectra but now has a good orbital solution. Errors on all parameters have been decreased and tighter limits have been placed on the secondary components of the single-lined systems. The mass ratio of the components of HD 148704 was also determined very accurately, and calculation of the inclination from photometry may allow accurate masses to be determined.
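The way an orbital solution constrains the masses can be sketched with the standard textbook relations for spectroscopic binaries. The Python below is an illustration, not the thesis's reduction pipeline: the function names are hypothetical, the input values are made up, and the constant assumes semi-amplitudes in km/s, periods in days and masses in solar masses.

```python
# Standard conversion constant for K in km/s, P in days, masses in solar masses.
MASS_CONSTANT = 1.0361e-7


def mass_ratio(k1, k2):
    """Mass ratio q = M2/M1 of a double-lined binary, equal to K1/K2."""
    return k1 / k2


def minimum_masses(k1, k2, period_days, eccentricity):
    """Return (M1 sin^3 i, M2 sin^3 i) for a double-lined binary."""
    factor = (MASS_CONSTANT * (1.0 - eccentricity ** 2) ** 1.5
              * (k1 + k2) ** 2 * period_days)
    # The more massive star moves more slowly, so M1 scales with K2.
    return factor * k2, factor * k1


def mass_function(k1, period_days, eccentricity):
    """Mass function f(m) of a single-lined binary, in solar masses."""
    return (MASS_CONSTANT * (1.0 - eccentricity ** 2) ** 1.5
            * k1 ** 3 * period_days)


if __name__ == "__main__":
    # Made-up orbital elements, not values from the thesis.
    print(mass_ratio(50.0, 100.0))
    print(minimum_masses(100.0, 100.0, 10.0, 0.0))
    print(mass_function(10.0, 100.0, 0.0))
```

Because the double-lined relations return only M sin^3 i, an independent inclination, such as one derived from photometry, is needed to turn these minimum masses into actual masses. For single-lined systems, only the mass function is available, which is why it yields limits on the secondary rather than a mass.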
|
195 |
To what extent will the annual number of episodes of acute confusion within a medical unit be reduced following the introduction of high risk indicators and early intervention strategies / Moloney, Clint, January 2005
This simple quantitative descriptive case controlled research compared cases (subjects at risk for acute confusion) with controls (subjects without the attribute); comparison was made on the exposure to potential contributing factors suspected of causing acute confusion, for example, heavy smoking, or the number of alcoholic drinks consumed per day. Case-control studies were also retrospective, because they focused on conditions in the past that might have caused subjects to become cases, rather than controls. The basic purpose of this research design was essentially the same as that of experimental research: to determine the relationships among variables. This report demonstrates that, with relatively good adherence by the nursing team, proactive screening using a structured risk assessment protocol can be successfully implemented for medical patients. This assessment was associated with a statistically significant 50 per cent reduction in the incidence of acute confusion in the intervention group, compared with usual care retrospectively. Reduction in acute confusion was not associated with shortened length of stay, but length of stay was often predetermined by protocol or critical pathway. Correlation analysis demonstrated that risk screening appeared most effective in preventing or reducing acute confusion in patients without preadmission dementia or ADL impairment. In patients with significant preadmission impairment, the stress of hospitalisation may be sufficient to precipitate an episode, despite otherwise optimal management. Less-impaired patients may require additional insults to precipitate acute confusion, some of which are avertable by risk screening and subsequent early intervention. Determined risk indicators were consistent throughout the four year timeframe set for this research project. This demonstrated that although there were multiple patient types presenting to this clinical area, they were consistently the same over a longitudinal timeframe. 
It meant they were reproducible, which gave this research additional strength. Also, based on the descriptive statistics, this research has shown that in this clinical area, where the intervention was introduced, the combination did have a positive impact on the annual number of episodes of acute confusion. In summary, these findings suggest that without risk screening and direction for appropriate management, the likelihood of an episode can more than double. In the three subgroups expected to pose the greatest challenges for the risk assessment (i.e. those 70 years or older, those with suspected drug dependency, and those with symptomatic infection), risk assessment retained excellent sensitivity, specificity, and relevant correlation with reduction of episodes. This research has demonstrated throughout that high risk screening and associated intervention based on the risk indicator can decrease the annual number of actual episodes of acute confusion. Interventions to prevent or reduce an episode of acute confusion, as outlined by Wakefield (2002) and this research, definitely increase as a result of high risk screening. Both the literature reviewed and the findings of this research make clear that risk screening needs to be adapted to the individual clinical setting and cannot be generic.
|
196 |
Novel approaches for application of principal component analysis on dynamic PET images for improvement of image quality and clinical diagnosis / Razifar, Pasha, January 2005
Diss. (summary) Uppsala : Uppsala universitet, 2005. / Includes 6 papers.
|
197 |
Efficient correlated pattern discovery in databases / Ke, Yiping. January 2008
Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2008. / Includes bibliographical references (leaves 112-120). Also available in electronic version.
|
198 |
New results in dimension reduction and model selection / Smith, Andrew Korb. January 2008
Thesis (Ph. D.)--Industrial and Systems Engineering, Georgia Institute of Technology, 2008. / Committee Chair: Huo, Xiaoming; Committee Member: Serban, Nicoleta; Committee Member: Shapiro, Alexander; Committee Member: Yuan, Ming; Committee Member: Zha, Hongyuan.
|
199 |
Comparisons of correlation methods in risk analysis / Moore, Julie Carolyn. January 1994
Thesis (M.S.)--Virginia Polytechnic Institute and State University, 1994. / Vita. Abstract. Includes bibliographical references (leaves 46-47). Also available via the Internet.
|
200 |
Statistical analysis and validation procedures under the common random number correlation induction strategy for multipopulation simulation experiments / Joshi, Shirish, January 1991
Thesis (M.S.)--Virginia Polytechnic Institute and State University, 1991. / Vita. Abstract. Includes bibliographical references (leaves 71-72). Also available via the Internet.
|