Research

The Multimedia Communications Laboratory conducts research at the intersection of content processing, computing, and communications. In addressing this rich opportunity, we emphasize academic rigor, industry guidance, and a pragmatic view toward creating impact.
Content Processing
"Mixed Initiative Multimedia for Mobile Devices: Design of a Semantically Relevant & Low Latency System for News Video Recommendations" Abstract by Jeannie Lee
Mobile devices have inherent resource constraints such as limited network bandwidth and small screen size. To facilitate access to news video on mobile devices, a coordinated design approach is taken, considering various system perspectives. The goal is to provide a cognitively palatable stream of videos and a seamless, low-latency user experience through the use of an adaptive mixed-initiative interface to solicit user relevance feedback, and content retrieval integrated with client-side video buffering and pre-fetching. These components are otherwise usually considered independently in the system design. The experiments suggest that this approach is helpful for recommending news video content on a mobile device, and areas for future investigation are outlined.
"Image Compression to Enhance Clinical Diagnosis and Workflow in Telepathology" Abstract by Saunya Williams
Telepathology consists of a digital environment used for managing, interpreting, sharing, and transmitting pathological information to a remote site via a telecommunications link. While the technology for telepathology has been available for several years, it has yet to be fully adopted as commonplace among pathologists. Several challenges hinder the widespread application of telepathology, such as diagnostic accuracy, patient data security, medical liability, and image quality. A digitized glass slide specimen is large (up to 25 megapixels), which makes image compression important. Typically, JPEG achieves compression ratios of 30:1 to 50:1, with defects varying from small to moderate. The peak signal-to-noise ratio (PSNR) is commonly used in image processing to measure the quality of a compressed image relative to its original. Without a clear standard for digital imaging within telepathology, image quality lacks a metric in terms of diagnostic losslessness and remains vulnerable to subjectivity. This project includes a variety of digital slides provided by Emory University that are compressed using JPEG. The research on image compression will facilitate the development of criteria to ensure diagnostic accuracy. These findings will be invaluable in helping to increase the application of telepathology and to establish a telepathology network within the Emory healthcare system.
"Objective Measurement of Transcoded Video Quality in Mobile Applications" Abstract by Ramanathan Palaniappan
As wireless standards evolve to create a globally applicable third generation (3G) mobile phone system specification, there is significant interest in the user-perceived quality of the multimedia services supported in mobile applications. Hence, the long-standing practice of assessing video quality by conventional reference-based objective measurements needs to be replaced by more ubiquitous, zero-reference measurements. Such metrics remove the need to access the original reference video to assess quality, which reflects the most practical scenarios in the end-to-end video distribution chain. In addition, these zero-reference metrics need to have a high degree of correlation with subjective measurements. One example of such a technique is the AVQ meter, developed at Georgia Tech and VQLink, which has been shown to provide accurate estimates of subjectively evaluated Mean Time Between Failures (visible artifacts).
"Telepathology Research - Subjective Evaluation and Comparison of Compressed Image Quality" Abstract by Sourabh Khire
JPEG 2000 and JPEG are ISO/ITU-T standards for still image coding. Both standards support lossy compression, i.e., they accept some loss of information in order to achieve higher compression. The effect of such lossy compression on the quality of the original image can be quantified using simple measures such as mean squared error (MSE) or peak signal-to-noise ratio (PSNR). However, PSNR and MSE do not always correlate well with the visual quality of the compressed images, so it is valuable to gather information about the fidelity of compressed images through subjective evaluation experiments. One such subjective test was conducted at MMC to evaluate and compare images compressed using the JPEG and JPEG 2000 algorithms. The image database for the test consisted of medical and non-medical images compressed using JPEG and JPEG 2000 at bit rates between 0.2 and 1 bits per pixel (bpp). The results indicate that, visually, JPEG 2000 holds some advantage over JPEG at lower bit rates (less than 0.6 bpp), but this perceptual difference tends to vanish at higher bit rates.
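The two objective measures named in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: images are flat lists of 8-bit grayscale pixel values, the function names `mse` and `psnr` are hypothetical, and the sample pixel values are made up rather than taken from the MMC test database.

```python
import math

def mse(original, compressed):
    """Mean squared error between two equal-sized images (flat pixel lists)."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)

def psnr(original, compressed, peak=255):
    """PSNR in dB: 10 * log10(peak^2 / MSE); peak=255 assumes 8-bit pixels."""
    error = mse(original, compressed)
    if error == 0:
        return math.inf  # identical images: no distortion to measure
    return 10 * math.log10(peak ** 2 / error)

# Made-up pixel values, for illustration only.
original = [52, 55, 61, 66, 70, 61, 64, 73]
compressed = [54, 55, 60, 66, 69, 62, 64, 70]
print(mse(original, compressed))
print(round(psnr(original, compressed), 2))
```

Both measures depend only on pixel differences, which is one reason they may disagree with what viewers report in a subjective test.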
"Delay Bound Rich Image Delivery over WLANs" Abstract by Shira Krishnan
Today's globally distributed teams that seek the convenience of a wireless workspace have created an increased need for collaboration through image transmission. These scenarios require both acceptable image quality and acceptable transmission delay, while using networks that also carry other services, such as voice, data, and video. It is especially challenging when High-Definition images requiring lossless transmission are encountered. In this work, an attempt is made to identify the wireless network systems that are most suited for rich media delivery within set interactive timeframes, using a network designed to handle other forms of traffic as well. The traffic of interest is High-Density Images that have a delay bound acceptable to the users. Three different sizes of images with three degrees of compression (raw format, mathematically lossless compressed form, and diagnostically lossless compressed form) are studied while making recommendations.
"Mean Time Between Visible Artifacts in Visual Communications" Abstract by Nitin Suresh
As digital communication of television content becomes more pervasive, and as networks supporting such communication become increasingly diverse, the long-standing problem of assessing video quality by objective measurements becomes particularly important. Content owners as well as content distributors stand to benefit from rapid objective measurements that correlate well with subjective assessments and, further, do not depend on the availability of the original reference video. This thesis investigates different techniques of subjective and objective video evaluation. Our research recommends a functional quality metric called Mean Time Between Failures (MTBF) [1], where a failure is a video artifact deemed to be perceptually noticeable, and investigates objective measurements that correlate well with subjective evaluations of MTBF. In this work, the subjective tests for evaluating MTBF involve different video clips from the Video Quality Experts Group (VQEG) [2], encoded in MPEG-2 format at bit rates in the range of 1.5–5 Mbps and subject to packet losses in the range of 0.1–2.0%. Each of the test clips is 140 seconds in length, and a diverse viewer pool of 30 subjects was used. The usefulness of several existing objective metrics was assessed by noting their correlation with MTBF. The metrics studied include full-reference, reduced-reference, and no-reference objective metrics: PSNR, the Just Noticeable Difference metric (JND) [3], the Spatial Temporal Join Metric (STJM) [4], and the Blockiness metric (BLK) [5]. The research also includes experimentation with network-induced artifacts and a study of statistical methods for correlating candidate objective measurements with the subjective metric [6]. The statistical significance and spread properties of the correlations are studied, and subjective MTBF is compared with the existing subjective measure of MOS.
These results suggest that MTBF has a direct and predictable relationship with MOS, and that the two have similar variations across different viewers when computed over any clip. The research is particularly concerned with the development of new no-reference objective metrics that are easy to compute in real time and that correlate better than current metrics with the intuitively appealing MTBF measure. The approach to obtaining greater subjective relevance has included the study of better spatial-temporal models for noise masking and test data pooling in video perception. A new objective metric, the 'Automatic Video Quality' metric (AVQ) [6], is described and shown to run in real time with a high degree of correlation with actual subjective MTBF scores, with correlation values approaching those of metrics that use full or partial reference. This metric does not need any reference to the original video and, when used to display MPEG-2 streams, calculates and indicates the video quality in terms of MTBF. Certain diagnostics, such as the amount of compression and network artifacts, are also shown.
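To make the MTBF idea concrete, here is a minimal sketch. It rests on assumptions that the abstract does not state: MTBF is taken as the clip duration divided by the number of artifacts a viewer flags, and agreement between an objective metric and subjective MTBF is measured with a plain Pearson correlation. The function names and the sample timestamps are hypothetical and are not data from the study.

```python
import math

def mtbf_seconds(clip_duration, failure_times):
    """Assumed definition: clip duration divided by the number of flagged artifacts."""
    if not failure_times:
        return math.inf  # no visible artifact was flagged during the clip
    return clip_duration / len(failure_times)

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical times (in seconds) at which one viewer flagged artifacts
# in a 140-second clip.
flagged = [12.0, 47.5, 80.1, 133.0]
print(mtbf_seconds(140, flagged))
```

Under this assumed definition, a higher MTBF means fewer noticeable artifacts per unit of viewing time.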
"Elastic Algorithms for Region of Interest Video Compression, with Applications to Mobile Telehealth" Abstract by Sira Rao
Video is the most demanding modality from the viewpoints of bandwidth, computational complexity, and resolution. Thus, there has been limited progress in the field of mobile video technology. In this research, the focus is on elastic wireless video technology and its adaptation to diagnostic application requirements in real-time clinical assessment. It is important and timely to apply wireless video technology to real-time remote diagnosis of emergent medical events. This premise comes from initial successes in telehealth based on wired networks. The enablement of mobility (for the physician and/or the patient) by wireless communication will be the next major step, but this advance will depend on definitive and compelling demonstrations of reliability. Thus, an important goal of the research is to develop a complete methodology that will be embraced by physicians. Acute pediatric asthma has been identified as a domain where this new capability will be highly welcome. The research uses flexible and interactive algorithms for Region-of-Interest (ROI) processing. ROI processing is a useful approach to achieving the optimal balance in the quality-bandwidth tradeoff characteristic of visual communication services. The notion of ROI has traditionally been used mostly for foreground-background separation in scene rendering and manipulation, and only more recently for variable-quality compression. Even when the latter goal is considered, quality criteria have been ad hoc and at best useful for video conferencing, whereas the medical domain has its own fidelity criteria. The research thus focuses on the design of an elastic ROI-based compression paradigm with medical diagnosis as a central criterion. The research describes the methodology to achieve elasticity through rate control algorithms at the encoder.
An elastic approach is proposed that uses a priori user-specified video quality information, quantifies this information, and incorporates it into the encoder in the form of region-quality mappings. This method is compared to a parametric bit allocation approach that is based on region features and a set of tuning weights. A number of videos of actual patients were filmed and used as the video database for the developed algorithms. In testing the elastic and parametric algorithms, both objective measures, in the form of peak signal-to-noise ratio (PSNR), and subjective evaluations were used.
"Techniques for Robust Communications of One-Way Video" Abstract by Seong Hwan Jang
The primary objective of the proposed research is to develop video coding systems for one-way transmission over error-prone networks, such as video on demand (VOD), digital broadcast TV, and video messaging. In this proposal, we investigate efficient video coding and transmission algorithms that exploit the characteristics of one-way video transmission systems. Recently, there has been great demand for high-quality visual services. In particular, one-way digital video services using pre-encoded video bit streams are becoming widely available over the Internet and wireless channels. However, due to bandwidth constraints and transmission errors, the transmitted and decoded video quality is still inadequate. The delay constraints of interactive real-time video applications such as video conferencing make it even more difficult to encode and transmit the video signal effectively. One-way video transmission systems, on the other hand, offer some unique conditions for both source coding and transmission. In one-way communication, the encoder is allowed much more coding delay and can take advantage of it for effective coding and transmission. Video encoders have to operate within fixed bandwidth limitations unless the network provides variable bit rate (VBR) transmission. The output bit rate of the encoder increases when the video sequence contains a large amount of motion or texture and decreases when the sequence contains stationary scenes. Output that exceeds the available bit rate must be trimmed by an appropriate rate control algorithm to meet the bandwidth constraints. Higher compression ratios are possible at the cost of an imperfect representation of the video source at the decoder. Therefore, at the same bit rate, the video quality can fluctuate from one video sequence to another.
In a one-way video application, the encoder can access a limited number of future frames as well as the current frame, for efficient temporal bit allocation and constant quality. The optimal bit allocation problem can be solved with a Lagrangian multiplier-based operational rate-distortion (R-D) framework and some coding delay [4-8]. However, many popular video coding schemes involve dependent coding units, as in motion compensation. Therefore, the set of available R-D operating points of a predicted frame depends on the R-D points of its reference frame. The complexity of solving the optimal bit allocation problem with a dependent R-D tree increases exponentially with the depth of the dependency tree. To ease this computational complexity, a path-pruning or dynamic programming algorithm can be used to avoid growing all the R-D data while retaining optimality. The output of the encoder is connected to a buffer whose purpose is to even out the fluctuations of the variable rate and to transmit the output bit stream at a constant bit rate (CBR). The range of bit allocation is further constrained to prevent the buffer from overflowing or underflowing. These constraints can be relaxed by increasing the buffer size. However, as the encoder enlarges its buffer for more flexible bit allocation, the initial delay in the decoder buffer must increase, too. If we use a pseudo-VBR scheme and increase the size of the decoder buffer, we gain flexibility in buffer constraints and bit allocation without increasing the initial decoder buffer delay. The basic idea of pseudo-VBR is that the encoder saves bits in stationary sequences in order to spend them in active sequences. This scheme is possible in one-way video coding because the encoder can access and observe a limited number of the sequences to be coded. Even when video quality is optimized under the given bit rate constraints, the decoded quality can be severely damaged by transmission errors.
An error concealment scheme can be used to visually hide the damaged area, but it is not effective at packet error rates of more than 3%, since the error region propagates both spatially and temporally [15–20]. Error propagation can be minimized by an error-resilient coding scheme. The basic idea of error-resilient coding is to prevent spatial error propagation by inserting marker bits and temporal error propagation by INTRA mode updates. Error-resilient coding is not optimal in terms of coding efficiency in error-free environments. The error-resilient coding scheme can also differ depending on whether network or channel feedback information is available. Therefore, we need to distinguish the video coding scheme among error-free, error-prone with feedback, and error-prone without feedback environments. To dynamically change the coding mode of a pre-encoded bit stream using feedback, a transcoding scheme can be used. The main issue in transcoding is to minimize computational complexity while minimizing quality degradation. Multiple description coding (MDC) is a coding technique that generates multiple independently coded bit streams for transmission over separate paths. Even when only one or a few of the description bit streams are successfully received, the decoder can still reconstruct video of lower but acceptable quality. Object-oriented source-channel coding is another approach to improving visual quality and error resilience. The main idea of the object-oriented approach is to discriminate resource allocation between objects and non-objects, since human attention is usually on one dominant object. Object-oriented coding approaches are characterized by computationally intensive algorithms for segmenting objects, which is acceptable in one-way video. The approach can also provide adaptivity to the semantic content of the video. Therefore, we can improve the visual quality and error resilience of a one-way video coding system through an efficient coding scheme and encoding delay.
The error resilience scheme in one-way video should differ depending on whether feedback information from the channel or network is available.
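The Lagrangian bit allocation mentioned in this abstract can be illustrated with a small sketch. It makes a simplifying assumption that the abstract explicitly warns does not hold for motion-compensated coding: frames are treated as independent, each with its own list of (rate, distortion) operating points. The function names, the bisection search over the multiplier, and the sample numbers are all hypothetical.

```python
def allocate(frames, lam):
    """For each frame, pick the operating point minimizing D + lam * R."""
    return [min(points, key=lambda p: p[1] + lam * p[0]) for points in frames]

def total_rate(choice):
    """Sum of the rates of the chosen operating points."""
    return sum(rate for rate, _ in choice)

def allocate_within_budget(frames, budget, iterations=50):
    """Bisect on the multiplier to find the smallest lam that meets the budget."""
    low, high = 0.0, 1e6
    best = allocate(frames, high)  # very large lam favors the lowest rates
    if total_rate(best) > budget:
        raise ValueError("budget is below the minimum achievable rate")
    for _ in range(iterations):
        mid = (low + high) / 2
        choice = allocate(frames, mid)
        if total_rate(choice) <= budget:
            best, high = choice, mid  # feasible: try a smaller penalty on rate
        else:
            low = mid  # over budget: penalize rate more
    return best

# Made-up (rate, distortion) points: a stationary frame and an active frame.
frames = [
    [(10, 40.0), (20, 30.0), (40, 25.0)],
    [(10, 200.0), (20, 100.0), (40, 40.0)],
]
print(allocate_within_budget(frames, budget=60))
```

In this toy case the active frame receives the larger share of the budget, which mirrors the idea of saving bits on stationary content and spending them on active content. Handling dependent frames would require the tree search or dynamic programming that the abstract describes.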
"Distributed Speech Recognition for Mobile Devices" Abstract by Brian Delaney
Given that portable wireless devices are limited in computation, memory size, wireless bandwidth, and battery energy, distributing the speech recognition task across the network is an attractive alternative. Speech recognition can be a computationally demanding application that can easily use all available resources. An in-depth understanding of these issues in the context of a distributed speech recognition system will enable designers of future systems to build more efficient devices and algorithms. In particular, we will study the effects of wireless networking and fading channel characteristics on distributed speech recognition. We will investigate quality of service and energy trade-offs in this context.
"Minimum Distortion Data Hiding For Compressed Images" Abstract by Cagatay Candan
In this chapter, we describe data hiding, introduce the problem examined, and present the motivation for this line of research. The motivation is described through a complicated video application that illustrates the requirements of similar applications. We have chosen to discuss this example explicitly because most of the data hiding literature is not designed for this kind of application; instead, it is targeted toward copyright protection. The copyright application aims to embed imperceptible data into a multimedia signal in such a way that the embedded data is guaranteed to survive deliberate attacks by hackers. In this application field, the level of distortion of the multimedia signal is a secondary factor in comparison with robustness. The application area of our minimally distortive method is communications. The previously mentioned standard-compatible upgrade of the security of the JPEG compression system is a good example of such an application. To further elaborate on the application range, we discuss a complicated video data broadcasting system. This discussion should pinpoint the requirements and expectations for a data hiding system and illustrate the applicability of this relatively new line of research. In this thesis, we focus on still JPEG images, which are the major subcomponent of MPEG-based video systems.
Communications & Networking
Wireless Channel Modeling: Ray Tracing for Propagation Modeling: Abstracts by Junghyuck Jo
Any type of cellular or personal communication system requires careful planning and prediction of signal coverage and interference levels. Unfortunately, this type of in-depth site planning requires tremendous amounts of measured data and trial-and-error testing that can often be prohibitively expensive. Therefore, a huge demand already exists in the wireless industry for the development of accurate propagation prediction techniques, such as site-specific prediction of channel information using ray-tracing methods. The prediction of large-scale path loss has represented the dominant application of site-specific techniques. However, as computerized site information becomes available and as future wireless systems operate with higher bandwidths, the application of deterministic prediction techniques becomes very attractive. Deterministic methods could provide a wireless engineer with any number of channel parameters, including angle of arrival, delay spread, fading characteristics, and the complete channel impulse response. Site-specific techniques promise to facilitate the design of wireless modems by replacing test and measurement with the convenience of computer simulations. In fact, the ability of a ray-tracing algorithm to estimate the actual wideband channel impulse response with angle-of-arrival data may enable significant advances in many other areas of wireless research that depend on spatial-temporal channel characteristics. These areas include position location, adaptive arrays, smart antennas, diversity, and equalization.
Interference Modeling between IEEE 802.11 and Bluetooth Wireless Systems
IEEE 802.11 and Bluetooth radios share common spectrum in the 2.45 GHz ISM band. The issue of coexistence between IEEE 802.11 Direct Sequence Spread Spectrum (DSSS) and Bluetooth radios, with both radio types located within a mixed environment, is studied. This study focuses exclusively on the reliability of the IEEE 802.11 wireless network in the presence of interference from Bluetooth radios. The reverse situation, that is, modeling interference from IEEE 802.11 to Bluetooth, is not analyzed.
Crosstalk Cancellation in DSL Systems: Abstract by Roberto A. Uzcategui
No one would contend that the twisted pair access network does not represent one of the biggest investments made by the telephone companies. The importance of that asset explains why attempts to provide new services have focused on making the most out of the existing infrastructure. Digital subscriber line (DSL) systems transport information at broadband speeds over telephone cables, thus giving telephone companies the opportunity to compete with cable, wireless, and satellite service providers for a share of the emerging broadband market without needing to overhaul their outside plant. Unfortunately, because it was designed for voice-grade communications, the physical makeup of telephone cables gives rise to a series of impairments at broadband speeds. One of them, whose impact is minor at voice-band frequencies but potentially crippling at wider bandwidths, is the capacitive and inductive coupling between the signals transported in different subscriber loops, known as crosstalk. Crosstalk may be produced by a transmitter located at either the same end (NEXT) or the opposite end (FEXT) of the cable relative to the position of the disturbed receiver. For research purposes, crosstalk is considered an additive phenomenon in which each individual interfering signal is produced by passing the information-bearing signals of the other users through the appropriate coupling transfer functions. Since the resulting model is very similar to the multi-access channel model employed in wireless multiuser communications, it is intuitively appealing to try to mitigate crosstalk with interference cancellation techniques similar to the ones used in wireless communications. Multiuser detection is one such technique.
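The additive crosstalk model described above can be sketched for two lines. This is an illustration under strong assumptions and is not claimed to be the method studied in the research: the coupling is reduced to a known, frequency-flat 2x2 gain matrix, noise is ignored, symbols are 4-level PAM, and the detector is a simple decorrelating (zero-forcing) one, which is among the simplest multiuser detection schemes. All names and numbers are hypothetical.

```python
LEVELS = (-3, -1, 1, 3)  # assumed 4-level PAM symbol alphabet

def received(coupling, sent):
    """Additive model: each line sees its own signal plus coupled copies of the others."""
    return [sum(h * x for h, x in zip(row, sent)) for row in coupling]

def nearest_symbol(value):
    """Slice a real-valued estimate to the closest symbol in the alphabet."""
    return min(LEVELS, key=lambda s: abs(s - value))

def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("coupling matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def decorrelate(coupling, observed):
    """Undo the coupling with the matrix inverse, then slice each estimate."""
    estimates = received(invert_2x2(coupling), observed)
    return [nearest_symbol(e) for e in estimates]

# Hypothetical coupling gains: off-diagonal entries are the crosstalk paths.
coupling = [[1.0, 0.2], [0.4, 1.0]]
sent = [3, -1]
observed = received(coupling, sent)
print([nearest_symbol(y) for y in observed])  # per-line detection, ignoring crosstalk
print(decorrelate(coupling, observed))        # joint detection using the coupling model
```

In this noise-free example, detecting each line on its own misreads the second symbol, while the joint detector recovers both. With real noise, inverting the coupling can amplify it, which is one reason more elaborate multiuser detectors exist.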
Multimedia Transport
"Multimedia Communications Over Wireless Home Networks" Abstract by Babak Firoozbakhsh
In our research, we have focused on a wide range of issues spanning the wireless home. We started with a novel demonstration of the communication of vital signs from the body over WaveLAN wireless networks. We have also studied the feasibility of UWB in an indoor wireless area subjected to interference from IEEE 802.11a. We are currently focusing on conducting measurements in the Georgia Tech Residential Laboratory (Aware Home), as well as developing an optimized UWB medium access control (MAC) protocol specifically designed for high-data-rate wireless home networks.
Computing
"Mixed Initiative Multimedia for Mobile Devices: Design of a Semantically Relevant & Low Latency System for News Video Recommendations" Abstract by Jeannie Lee
"A Network-Aware Semantics-Sensitive Image Retrieval System" Abstract by Janghyun Yoon
While significant progress has been made in content-based image retrieval (CBIR) over the last several years, less work has addressed issues related to overall system design from a networked system viewpoint. Since most image retrieval services are requested by remote users, possibly mobile users with limited device resources, the network environments of CBIR systems will affect the overall performance of the image retrieval process. Currently, this process is optimized and well tuned on stand-alone workstations with traditional performance metrics such as recall and precision. These metrics do not guarantee a satisfactory user experience in a general network scenario. In this dissertation, we investigate how to enhance semantic relevancy in the retrieval process through semantic feature extraction and relevance feedback. In addition, we propose prefetching and scalable image delivery (progressive and region-of-interest-based delivery of images) to reduce network latency in the retrieval process and to adapt the retrieval process to the network bandwidth and the capabilities of the user device (especially the size of the display screen). However, these two goals, the maximization of semantic relevancy and the minimization of retrieval latency, sometimes conflict with each other. If the user's resources are limited, a slight sacrifice of semantic relevancy can improve the overall performance of image retrieval. As such, we investigate the joint optimization of these two goals as a specific goal of this dissertation.
The MMC program is part of the School of ECE at Georgia Tech, with cross-disciplinary collaborations, particularly with the College of Computing. Industry connections are in the context of advanced telecommunications initiatives such as the Georgia Tech Broadband Institute and the Yamacraw and GCATT programs of the Georgia Research Alliance. We are also involved in specific student internships with industry partners such as Alcatel-Lucent, AT&T, Cox Communications, Cisco, HP Labs, NCR, Echostar, EG Technology, Tellabs and VQLink.
This page was last modified January 2009.
Copyright (C) 2009 MMC, Georgia Tech. All Rights Reserved.