Overview
MMSys'21 and its three co-located workshops will be held in hybrid mode in the Turkey time zone (TRT; UTC+3). This means that the sessions will run on local time, and the talks will be given in person or online (through the system to be provided) by one of the authors. Talks are allocated 20 minutes including the Q&A (see the exceptions for the open dataset/software, demo/industry and doctoral symposium papers below). Participants will be able to ask questions through the online system.
The authors have been asked to provide a short and a long version of their talks; these videos and the slides will be shared with the conference participants before Sept. 28th.
The joint open dataset/software and demo/industry session (Sept. 29th, 4:00 pm) will start with 1-slide/1-minute intros given by the authors and will then break into individual virtual booths. Some of the authors will be on site and can set up their booths in the morning of the 29th, allowing the on-site participants to visit the booths during the day. To cater to the online participants, however, every author must also be present in their virtual booth, which are listed here, until the session is over (i.e., 6:00 pm). The authors will present and demonstrate their work, and answer any questions the participants may have. Note that the open dataset/software and demo/industry papers will also have their short and long videos posted before the conference starts.
The doctoral symposium will start with 10 five-minute presentations, followed by a common Q&A.
Finally, we also have four awesome keynotes, detailed below, which will be presented live at the designated times.
Keynotes
- AI Driven Solutions throughout Games' Lifecycles Leveraging Big Data
by Qiaolin Chen (Tencent IEG Global)
Online gaming is the form of multimedia accounting for the biggest share of global digital media market revenues, with $21.1 billion in revenue in 2020 and around 1 billion online gamers worldwide. AI and big data are driving the tremendous evolution of the gaming industry, from intelligent market decisions to data-driven game development and smart live-ops. In this talk, we will first give an overview of AI-driven solutions across the game product life cycle, followed by use cases demonstrating how big data can improve users' overall experiences, including game AI, personalized in-game recommendations, ad creative understanding for user acquisition, and automatic content generation of game video clips. With these novel applications, AI can take users' experiences of engaging and interacting with multimedia in gaming to the next level.
Qiaolin Chen is the Head of Data Science at Tencent IEG Global, leading efforts related to data science solutions throughout the game product life cycle, including data-driven game design and production, game AI, user acquisition, LTV prediction, in-game recommendation, user life cycle analysis, virtual economy, and user profile systems. She joined Tencent Games in 2018; prior to that, she was a data scientist at SparkBeyond and a principal statistician at Novartis. She received her B.S. from Peking University and her Ph.D. in Biostatistics from UCLA.
- Making Impossible Products: How to Get 0-to-1 Products Right
by Caitlin Kalinowski (Facebook)
Caitlin Kalinowski heads up the VR Hardware team at Facebook Reality Labs, the division responsible for the Oculus Quest 2 and Touch controllers. Her previous programs include the Oculus Rift, Rift S and Oculus Go. Before working at Oculus, Caitlin was a technical lead at Apple on the Mac Pro and MacBook Air products and was part of the original unibody MacBook Pro team. Caitlin received her B.S. in Mechanical Engineering from Stanford University in 2007.
Caitlin is passionate about increasing the number of women and other underrepresented minorities in the fields of technology and design. She believes the next generation of products must be designed and engineered by people with different backgrounds and experiences in order to produce the best possible products. Caitlin is on the Board of Axon and on the strategic board of Lesbians Who Tech, the largest women's tech conference in California and the largest LGBTQ professional network in the world.
- Synthetic Media: New Opportunities and New Challenges
by Chris Bregler (Google)
Recent AI breakthroughs in media creation techniques have opened up new possibilities for societally beneficial uses, but have also raised concerns about misuse. We can imagine translating a movie into any language in the world, providing universal access to knowledge that was not possible before. This talk discusses recent trends in generative media creation tools for images, video, and sound, including new Movie Dubbing, Voice Cloning, Creative Photo Effects, DeepFakes for good and bad, and, most importantly, CheapFakes. The latter include the most prevalent misinformation methods, which are the hardest to detect automatically. We present efforts by Google and the community that are currently combating abuses, and we discuss long-term solutions to the complex challenge of maintaining media integrity.
Chris Bregler is a Director and Principal Scientist at Google AI. He received an Academy Award in the Oscars' Science and Technology category for his work in visual effects. His other awards include the IEEE Longuet-Higgins Prize for "Fundamental Contributions in Computer Vision that Have Withstood the Test of Time," the Olympus Prize, and grants from the National Science Foundation, Packard Foundation, Electronic Arts, Microsoft, U.S. Navy, U.S. Air Force, and other agencies. Formerly a professor at New York University and Stanford University, he was named Stanford Joyce Faculty Fellow, Terman Fellow, and Sloan Research Fellow. In addition to working for several companies, including Hewlett-Packard, Interval, Disney Feature Animation, LucasFilm's ILM, and the New York Times, he was the executive producer of squid-ball.com, for which he built the world's largest real-time motion capture volume. He received his M.S. and Ph.D. in Computer Science from U.C. Berkeley.
- Role of ML in the Prediction of Perceptual Video Quality
by Sriram Sethuraman and Deepthi Nandakumar (Amazon)
Understanding perceptual video/image quality is critical to achieving compact visual representations without compromising on what is relevant to the human eye. Compact representations drive improved customer satisfaction and lower the costs associated with the storage and delivery of images/video. Subjective data-driven ML models are beginning to predict perceptual quality significantly better than ad-hoc, hard-to-compute, biologically inspired models. This talk presents some examples of the great strides made in this space through ML techniques, the opportunities they have unlocked, and the challenges that remain. It will also present some insights into Prime Video's research collaborations with academic partners to overcome some of these challenges, and how we plan to leverage that capability.
Sriram Sethuraman is a Sr. Principal Scientist in the Prime Video playback organization, leading efforts related to encoding optimization, video quality measurement, ML-based restoration, and next-generation video compression. He joined Prime Video in July 2019, prior to which he was the CTO and Sr. VP at Ittiam Systems, a Bangalore-based multimedia technology venture. During his 17-year tenure at Ittiam, he was the architect of its technologies and products in the fields of video compression, video communication, media broadcast, and computer vision/machine learning. He has been part of the MPEG-4, MPEG-7, and VVC standardization efforts. Prior to joining Ittiam, he served as a Senior Member of Technical Staff at Sarnoff Corporation. Sriram holds a Ph.D. from CMU. He has 34 issued patents (and several pending) and has authored more than 35 publications.
Deepthi Nandakumar is a Principal Research Scientist with the Amazon Go organization, working on the efficiency of large-scale computer vision pipelines. She has worked extensively in the Prime Video playback organization, leading efforts around video encoding optimization through content-adaptive encoding, video quality measurement, and next-generation compression schemes. Previously, she led the engineering and development of the world's leading open-source HEVC encoder, x265, designing and optimizing it for performance and encoding efficiency. She has a graduate degree from the University of Illinois Urbana-Champaign, where she worked on heterogeneous computing and massively parallel devices.
Detailed Schedule
All times are local (UTC+3). Click on 'more' for session details.

Lunch
All Welcome

MMSys (Main room)
Opening

MMVE (Main room)
Session #1: Content Adaptation and Delivery (more)

MMVE (Main room)
Session #2: Immersive Experiences (more)

Coffee Break
All Welcome

GameSys (Main room)
Session #1: Human-Game Interaction (more)

End of the Day
Freshen up

Welcome Drinks and Dinner
All Welcome

Coffee Break
All Welcome

MMSys (Main room)
Session #1: Immersive Media (more)

MMSys (Main room)
Session #2: Live Video (more)

Lunch
All Welcome

MMSys (Main room)
Session #3: Content Preparation (more)

Coffee Break
All Welcome

Grand Challenge #1 (Main room)
Detecting Cheapfakes (more)

Grand Challenge #2 (Main room)
Bandwidth Estimation for Real-Time Communications (more)

Open Dataset and Software (Main room)
Session #1: Software, Tools and Datasets (more)

Demo and Industry (Main room)
Session #1: Conventional and Immersive Encoding, Streaming and Analytics (more)

End of the Day
Freshen up

Kebab Night and Outgoing
All Welcome

Coffee Break
All Welcome

MMSys (Main room)
Session #4: Cloud-Based Multimedia Processing (more)

MMSys (Main room)
Session #5: Multimedia in Outdoor and Mobile Environments (more)

Lunch
All Welcome

MMSys (Main room)
Session #6: Computer Vision Systems (more)

Coffee Break
All Welcome

Doctoral Symposium (Main room)
Session #1: Next-Gen Researchers (more)

EDI Workshop (Main room)
Equality, Diversity and Inclusion Workshop (more)

End of the Day
Freshen up

Ottoman Cuisine
All Welcome

Coffee Break
All Welcome

NOSSDAV (Main room)
Session #1: Yet Another Streaming Session (more)

NOSSDAV (Main room)
Session #2: "Fıstık Gibi" Video (more)

Lunch
All Welcome

NOSSDAV (Main room)
Session #3: Deep Video (more)

Coffee Break
All Welcome

NOSSDAV (Main room)
Session #4: Deeper Video (more)

MMSys (Main room)
Closing and Awards (more)

End of the Day
Freshen up

Fish Night, Bosphorus Tour and Streamers' Party
All Welcome
Social Events
MMSys has always been characterized by social opportunities to promote interaction within the community. This year, we will also have the postponed celebrations for the 30th anniversary of NOSSDAV and the 25th anniversary of Packet Video.
Join us for the social events listed in the schedule above. All participants and their registered companions are welcome to attend these events.