1 |
Design of the Multimedia Processor Based on MMX Instruction Set
Hong, Shou-xi, 26 July 2007
Embedded-system applications are becoming more complex, and multimedia functions are among the most popular. Yet multimedia workloads still do not run smoothly on embedded systems. Existing solutions, such as DSPs and dedicated codec chips, mostly sit outside the embedded microprocessor. We propose a new architecture, the Multimedia Operation Register. Using the bit-slice concept, we design an operation pair that combines a bit storage cell with bit computation. Sixty-four operation pairs form a MOSU (Multimedia Operation Storage Unit), and one MOSU can execute all multimedia instructions. We use multiple MOSUs and three register addressing modes to achieve optimal SIMD execution. The number of MOSUs in the Multimedia Operation Register can be chosen flexibly according to the performance requirements of different kinds of operations.
We also design a new instruction set based on the Intel MMX instruction set and the operational characteristics of the H.26x video codec series. According to the simulation in Chapter 6, the new instruction set is more efficient than the Intel MMX instruction set, and the Multimedia Operation Register architecture achieves a 105% performance improvement over the C64 DSP.
|
2 |
Implementation of face detection algorithm with parallel extended-MMX instruction set
Tzeng, Hua-Yi, 20 August 2008
Face detection has many technical applications. Considering both accuracy and the regularity of the data layout, we chose a neural-network-based recognition algorithm for implementation. The implementation has three parts: the Modified Census Transform, hypothesis computation, and drawing a square frame to mark the face. The Modified Census Transform is a regular computation over regularly arranged data, so it suits SIMD execution; the other parts have irregular data layouts and are not easy to parallelize. This paper uses a SIMD processor architecture developed in our laboratory, with its multi-data-streaming capability, to implement the Modified Census Transform. The picture is divided into four parts that are processed at the same time, and the execution mode is switched to suit each algorithm, so that data fetching is smooth and the frequency of data movement is reduced. We add a new instruction that uses the 16-bit data format and four MMX registers to perform a 4×4 matrix transpose. Another new instruction loads data and performs sign or zero extension at the same time. Both accelerate parallel execution in multi-data streaming. We also support multi-data streams that are not contiguous: a striping mode fetches multiple data items separated by a constant distance, so that such streams can still be computed in parallel. In addition, we use the hypotheses to pick out the one person we want to find among different people. We compare two hypotheses; if the difference between the hypotheses of two different pictures is smaller than 0.3%, the pictures show the same person. Finally, we verify functional correctness on the UMVP-2500 platform. We compare performance against MMX and XScale and analyze the benefits of the multi-data-streaming SIMD architecture: the speedup is 373% relative to MMX and 345% relative to XScale.
These results show that the multi-data-streaming SIMD architecture is faster than the other SIMD architectures. It adds a new instruction for the 4×4 matrix transpose. Because the 4×4 transpose exchanges rows and columns, it provides a new abstraction: ordinary computation proceeds along a line, whereas the new abstraction operates on a plane. MMX and XScale do not offer this abstraction.
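As an illustration of why the Modified Census Transform maps well onto SIMD hardware, the following is a minimal scalar sketch of the transform in its commonly described form: each pixel of a 3×3 neighborhood is compared with the neighborhood mean, giving a 9-bit code. The function name and the row-major bit order are assumptions made for this sketch; the abstract does not state the exact variant used in the thesis.

```python
def modified_census_transform(image):
    """Return the 9-bit MCT code for every interior pixel.

    `image` is a 2D list of integer intensities. The bit order (row-major,
    top-left pixel as the most significant bit) is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    codes = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            total = sum(window)
            code = 0
            for value in window:
                # value > mean is equivalent to 9 * value > total,
                # which avoids floating-point arithmetic.
                code = (code << 1) | (1 if 9 * value > total else 0)
            codes[y - 1][x - 1] = code
    return codes
```

Every output pixel runs the same fixed sequence of additions and comparisons on neighboring data, which is the regularity the abstract refers to when it calls the transform suitable for SIMD execution.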
|
3 |
Efficient Implementation of Morphological Image Processing on Pentium Machines
Chen, Jau-Liang, 06 August 2001
Morphological image processing is especially useful in medical image processing, pattern recognition, and industrial automated inspection. Special-purpose hardware for morphological image processing is very expensive, while software implementations are too slow. The purpose of this paper is to speed up the software computation of morphological image processing through parallel processing on Pentium machines.
The morphological operation is similar to digital convolution. We realize parallel morphological operations on the Pentium machine by two methods, output decomposition and input decomposition, which are similar to the overlap-and-save and overlap-and-add procedures, respectively. Implemented on the Pentium machine, both methods prove very efficient with 64-bit parallelism. Our experimental results show that they are twice as fast as the 32-bit parallelism method. In addition to the simulation and real-time experiments, a set of theoretical formulas is derived to analyze our methods, and these agree quite well with the actual measured times.
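The two decompositions can be sketched on a one-dimensional grayscale dilation. This is a simplified illustration, not the thesis's packed 64-bit implementation: the flat symmetric structuring element, the block size, and all function names are assumptions made for this sketch. Output decomposition computes each output block from its input block plus a halo (as in overlap-and-save), while input decomposition lets each input block spread into a wider output span and merges the overlaps with `max` in place of addition (as in overlap-and-add).

```python
def dilate(signal, radius):
    """Reference dilation: sliding maximum over [i - radius, i + radius]."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def dilate_output_decomposition(signal, radius, block):
    """Overlap-and-save style: each output block reads its input plus a halo."""
    n = len(signal)
    out = []
    for start in range(0, n, block):
        stop = min(n, start + block)
        lo, hi = max(0, start - radius), min(n, stop + radius)
        local = dilate(signal[lo:hi], radius)
        # Keep only the outputs that belong to this block; drop the halo.
        out.extend(local[start - lo:stop - lo])
    return out


def dilate_input_decomposition(signal, radius, block):
    """Overlap-and-add style: each input block writes a wider output span."""
    n = len(signal)
    out = [None] * n
    for start in range(0, n, block):
        stop = min(n, start + block)
        for i in range(start, stop):
            for j in range(max(0, i - radius), min(n, i + radius + 1)):
                # Overlapping contributions are merged with max, not +.
                out[j] = signal[i] if out[j] is None else max(out[j], signal[i])
    return out
```

Both variants return the same result as the reference `dilate`; they differ only in whether the overlap is handled on the read side or the write side, which is the choice that matters when blocks are mapped onto wide registers.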
|
4 |
The implementation of H.264 algorithm with parallel extended MMX instruction set
Shen, Cheng-Ying, 20 August 2008
The H.264 standard is an important method for multimedia transmission and computation, but it is difficult to run smoothly on embedded systems because of their low clock rates. Although many new multimedia instruction sets have been developed, real-time multimedia computation is still difficult to implement on embedded systems.
This paper therefore uses the "Multimedia Operation Register", a SIMD architecture, to implement the H.264 algorithm on embedded systems and improve multimedia computing performance. The Multimedia Operation Register, which executes multiple data streams in parallel, uses the bit-slice concept to design an operation pair that combines a bit storage cell with bit computation. Because the data items used together are often stored in memory at addresses a constant distance apart, this paper uses a striping addressing mode, which works together with the parallel execution of multiple data streams, to load data at strided addresses from memory in a single instruction. This paper also designs a new instruction set based on the Intel MMX instruction set and the operational characteristics of multimedia computation.
When a designer implements H.264 with a multimedia instruction set using a single data stream, more iterations are needed to repeat the same work in every block. This paper needs fewer iterations, because the Multimedia Operation Register executes multiple data streams in parallel and processes the data of many different blocks at the same time. By changing the working mode, it can also reallocate registers among the arithmetic units that will be used. New instructions save much of the execution time of operations such as matrix transpose, data reordering, and the SAD (Sum of Absolute Differences) calculation. To reduce the number of memory accesses, this paper rotates data between two registers so that loaded data is reused as much as possible. Together, these methods greatly improve coding efficiency.
The conclusion of this paper is that parallel execution of multiple data streams will be a very important method for multimedia computation, and the paper proposes an innovative architecture to implement it. According to the simulation in Chapter 5, the Multimedia Operation Register runs H.264 more than four times as fast as the MMX instruction set. For the SAD calculation, it is as much as ten times faster than the MMX instruction set. Overall, its performance is even better than that of SSE4, the latest multimedia instruction set.
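For readers unfamiliar with the SAD calculation mentioned above, the following is a minimal scalar sketch of how it is typically used in block matching for motion estimation. The function names and the candidate-list interface are assumptions made for this illustration; the thesis's SIMD instructions are not reproduced here.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_match(current, candidates):
    """Return (index, score) of the candidate block with the lowest SAD."""
    scores = [sad(current, candidate) for candidate in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Each candidate's SAD is independent of the others, so the per-candidate loop in `best_match` is the kind of work that a multi-stream architecture can spread across several blocks at once.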
|
5 |
Implementation of Action Recognition Algorithm on Multiple-Streaming Multimedia Unit
Lin, Tzu-chun, 03 August 2010
Action recognition has developed rapidly and is broadly applied in several sectors, from homeland security, personal property protection, and home care to smart environments and motion-sensing games.
This paper analyzes an action recognition algorithm for embedded systems and finds that many of its blocks can be computed more efficiently with parallel execution. It implements the action recognition algorithm on the Multiple-Streaming Multimedia Unit (MSMU). The MSMU is an MMX-like SIMD architecture with SIMD operation and data storage. By introducing the concept of multiple streaming, the MSMU can adjust the number of parallel data streams dynamically by switching the instruction mode. Using mode switching and a newly added transfer instruction for 2D image processing, the paper studies the benefit of instruction mode switching.
By comparing the 128-bit SSE architecture and the MSMU architecture on a practical example, it highlights the problems faced when exploiting subword parallelism and brings out the advantage of multistreaming.
On the algorithm side, the paper studies slicing data into minimum elements and using bitwise operations for better efficiency. Compared with the embedded SIMD architecture "WMMX", the MSMU achieves a 3.49× overall speedup.
|
6 |
An MMX Study of Benzene Isomers and the Hydrogenation Products of Benzene
Zuo, Tianming; Huang, Thomas, 01 March 2004
We have calculated the structures, the heats of hydrogenation, and the resonance energies of benzene isomers. All structures and energies were calculated using the MMX force field. Using PC Model, the calculated structural parameters of the eight isomers of benzene are generally in good agreement with the experimental data. The heats of hydrogenation and resonance energies of the benzene isomers parallel the experimental data and need a systematic adjustment of 4.5 kJ/mol.
|
7 |
Study of the audio coding algorithm of the MPEG-4 AAC standard and comparison among implementations of modules of the algorithm
Hoffmann, Gustavo André, January 2002
Audio coding is used to compress digital audio signals, thereby reducing the number of bits needed to transmit or store an audio signal. This is useful when network bandwidth or storage capacity is very limited. Audio compression algorithms are based on an encoding and a decoding process. In the encoding step, the uncompressed audio signal is transformed into a coded representation, which compresses it. The coded audio signal eventually needs to be restored (e.g. for playback) through decoding: the decoder receives the bitstream and converts it back into an uncompressed signal. ISO-MPEG is a standard for high-quality, low-bit-rate video and audio coding. The audio part of the standard is composed of algorithms for high-quality, low-bit-rate audio coding, i.e. algorithms that reduce the original bit rate while guaranteeing high quality of the audio signal. The audio coding algorithms consist of MPEG-1 (with three different layers), MPEG-2, MPEG-2 AAC, and MPEG-4. This work presents a study of the MPEG-4 AAC audio coding algorithm. It also presents implementations of the AAC algorithm on different platforms and comparisons among them. The implementations are in C, in Intel Pentium assembly, in C on a DSP processor, and in HDL. Since each implementation has its own application niche, each one is valid as a final solution. Another purpose of this work is the comparison among these implementations, considering estimated costs, execution time, and the advantages and disadvantages of each one.
|
10 |
Potencial terapêutico de sistemas matriciais do ácido 5-aminossalicílico no tratamento de doenças inflamatórias intestinais: revisão sistemática / Therapeutic potential of 5-aminosalicylic acid matrix systems in the treatment of inflammatory bowel diseases: a systematic review
Simoni, Suelen Eloise, 09 March 2018
The development of new technologies for the treatment of inflammatory bowel diseases, such as the multi-matrix form of 5-aminosalicylic acid (5-ASA), is increasingly important, because they offer alternatives when treatment has been unsuccessful and can ease adaptation to a simpler dosing regimen. However, there is no clear conclusion as to whether their efficacy is superior or equivalent to that of the modified-release coated tablets used in clinical treatment protocols, nor as to their safety and tolerability. For this reason, the objective of this dissertation was to perform a systematic review. Using predefined keywords, databases (PubMed, Science Direct, Embase, Scopus, Web of Science, and Clinical Trials) were searched, complemented by a manual search, from March 2016 to December 2017, for studies comparing matrix technology with delayed-release coated tablets, in order to gather evidence of the equivalence or superiority of this formulation in terms of clinical and endoscopic remission in patients with ulcerative colitis. In all, three studies met the inclusion criteria and were included, all of them multicenter, double-blind, and randomized in design, with matrix tablet doses ranging from 1.2 g to 4.8 g daily and coated tablet doses ranging from 800 mg to 2.4 g, divided into two to three administrations daily. In summary, the efficacy of the two dosage forms is extremely similar in endoscopic remission; when clinical remission is considered, however, the results indicate that better rates are achieved with matrix technology, regardless of the dose used.
|