Technical and Social Program

Our goal is to hold an unforgettable face-to-face event in Istanbul for those who can make the trip, while letting those who cannot participate remotely.
Note to the authors of the open dataset/software and demo/industry papers from MMSys'20: You are cordially invited to present your work in Istanbul during MMSys'21.


Overview

MMSys'21 and its three co-located workshops will be held in a hybrid mode in the Turkey time zone (TRT; UTC+3). This means that the sessions will run on local time and the talks will be given in person or online (through the system to be provided) by one of the authors. Each talk is allocated 20 minutes, including the Q&A (see the exceptions for the open dataset/software, demo/industry and doctoral symposium papers below). Participants will be able to ask questions through the online system.

The authors have been asked to provide a short and a long version of their talks, and these videos and the slides will be shared with the conference participants before Sept. 28th.

The joint open dataset/software and demo/industry session (Sept. 29th, 4:00 pm) will start with 1-slide/minute intros given by the authors and will then break into individual virtual booths. Some of the authors will be on site, and they can set up their booths on the morning of the 29th. This will allow the on-site participants to visit the booths during the day. To cater to the online participants, however, every author must also be present in their virtual booths, which are listed here, until the session is over (i.e., 6:00 pm). The authors will present and demonstrate their work, and answer any questions the participants may have. Note that the open dataset/software and demo/industry papers will also have their short and long videos posted before the conference starts.
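For remote participants, the local times above need to be converted from UTC+3. The following is a minimal sketch of that conversion for the joint session, not part of the official conference system. It assumes the fixed UTC+3 offset stated in the overview and the year 2021 (inferred from the conference name); the helper name `trt_to_utc` is made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+3 offset, as stated in the program (assumption: it applies on every conference day).
TRT = timezone(timedelta(hours=3), name="TRT")


def trt_to_utc(year, month, day, hour, minute=0):
    """Interpret the given wall-clock time as TRT and return the same instant in UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=TRT)
    return local.astimezone(timezone.utc)


# Joint open dataset/software and demo/industry session: Sept. 29th, 4:00 pm to 6:00 pm local time.
# The year 2021 is an assumption inferred from "MMSys'21".
session_start = trt_to_utc(2021, 9, 29, 16)
session_end = trt_to_utc(2021, 9, 29, 18)

print(session_start.strftime("%Y-%m-%d %H:%M UTC"))  # 2021-09-29 13:00 UTC
print(session_end.strftime("%Y-%m-%d %H:%M UTC"))    # 2021-09-29 15:00 UTC
```

To see the session in your own time zone instead of UTC, calling `session_start.astimezone()` with no argument converts to the time zone configured on your machine.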

The doctoral symposium will start with 10 five-minute presentations, followed by a common Q&A.

Finally, we also have four awesome keynotes as detailed below and these will be presented live at the designated times.

Keynotes

  • Online gaming is a form of multimedia accounting for the biggest share of global digital media market revenues, with $21.1 billion revenue in 2020 and around 1 billion online gamers worldwide. AI and big data are driving the tremendous evolution of the gaming industry, from intelligent market decisions to data-driven game development and smart live-ops. In this talk we will first give an overview of AI-driven solutions across the game product life cycle, followed by use cases demonstrating how big data can improve users’ overall experiences, including game AI, personalized in-game recommendation, ad creative understanding for user acquisition, and automatic generation of game video clips. With these novel applications, AI can take users’ experiences of engaging and interacting with multimedia in gaming to the next level.

    Qiaolin Chen is the Head of Data Science at Tencent IEG Global, leading efforts related to data science solutions throughout the game product life cycle, including data-driven game design and production, game AI, user acquisition, LTV prediction, in-game recommendation, user life cycle analysis, virtual economy, and user profile systems. She joined Tencent Games in 2018, prior to which she was a data scientist at SparkBeyond and a principal statistician at Novartis. She received her B.S. from Peking University and Ph.D. in Biostatistics from UCLA.

  • Caitlin Kalinowski heads up the VR Hardware team for Facebook Reality Labs, the division responsible for the Oculus Quest 2 and Touch controllers. Previous programs include the Oculus Rift and Rift S and Oculus Go. Before working at Oculus, Caitlin was a technical lead at Apple on the Mac Pro and MacBook Air products and was part of the original unibody MacBook Pro teams. Caitlin received her BS in Mechanical Engineering from Stanford University in 2007.
    Caitlin is passionate about increasing the number of women and other underrepresented minorities in the fields of technology and design. She believes the next generation of products must be designed and engineered by people with different backgrounds and experiences in order to produce the best possible product. Caitlin is on the Board of Axon and the strategic board of Lesbians Who Tech, the largest women’s tech conference in California and the largest LGBTQ professional network in the world.

  • Recent AI breakthroughs in media creation techniques have opened up new possibilities for societally beneficial uses, but have also raised concerns about misuse. We can imagine translating a movie into any language in the world, and providing universal access to knowledge that was not possible before. This talk discusses recent trends in generative media creation tools for images, video, and sound, including new Movie Dubbing, Voice Cloning, Creative Photo Effects, DeepFakes for good and bad and, most importantly, CheapFakes. The latter include the most prevalent misinformation methods, which are also the hardest to detect automatically. We present efforts by Google and the community that are currently combating abuses, and we discuss long-term solutions to the complex challenge of maintaining media integrity.

    Chris Bregler is a Director and Principal Scientist at Google AI. He received an Academy Award in the Oscars' Science and Technology category for his work in visual effects. His other awards include the IEEE Longuet-Higgins Prize for "Fundamental Contributions in Computer Vision that Have Withstood the Test of Time," the Olympus Prize, and grants from the National Science Foundation, Packard Foundation, Electronic Arts, Microsoft, U.S. Navy, U.S. Air Force, and other agencies. Formerly a professor at New York University and Stanford University, he was named Stanford Joyce Faculty Fellow, Terman Fellow, and Sloan Research Fellow. In addition to working for several companies including Hewlett Packard, Interval, Disney Feature Animation, Lucasfilm's ILM, and the New York Times, he was the executive producer of squid-ball.com, for which he built the world's largest real-time motion capture volume. He received his M.S. and Ph.D. in Computer Science from U.C. Berkeley.

  • Understanding perceptual video/image quality is critical to achieving compact visual representations without compromising on what is relevant to the human eye. Compact representations drive improved customer satisfaction and lower the cost associated with storage and delivery of images/video. Subjective data-driven ML models are beginning to predict perceptual quality significantly better than ad-hoc, hard-to-compute biologically inspired models. This talk presents some examples of the great strides made in this space through ML techniques, the opportunities that have been unlocked by these, and the challenges that remain. It will also present some insights into Prime Video’s research collaborations with academic partners to overcome some of these challenges, and how we plan to leverage that capability.

    Sriram Sethuraman is a Sr. Principal Scientist in the Prime Video playback organization, leading efforts related to encoding optimization, video quality measurement, ML-based restoration, and next-generation video compression. He joined Prime Video in July 2019, prior to which he was the CTO and Sr. VP at Ittiam Systems, a Bangalore-based multimedia technology venture. During his 17-year tenure at Ittiam, he was the architect of its technologies and products in the fields of video compression, video communication, media broadcast, and computer vision/machine learning. He has been part of MPEG-4, MPEG-7, and VVC standardization efforts. Prior to joining Ittiam, he served as a Senior Member of Technical Staff at Sarnoff Corporation. Sriram holds a Ph.D. from CMU. He has 34 issued patents (and several pending patents) and is the author of more than 35 publications.

    Deepthi Nandakumar is a Principal Research Scientist with the Amazon Go organization, working on the efficiency of large-scale computer vision pipelines. She has worked extensively in the Prime Video playback organization, leading efforts around video encoding optimization through content-adaptive encoding, video quality measurement and next-generation compression schemes. Previously, she led the engineering and development of the world’s leading open-source HEVC encoder, x265, designing and optimizing for performance and encoding efficiency. She has a graduate degree from the University of Illinois Urbana-Champaign, where she worked on heterogeneous computing and massively parallel devices.

Detailed Schedule


All times are local (UTC+3).

Sept. 28th

  • Lunch (all welcome)
  • MMSys (Main room): Opening
  • MMVE (Main room): Session #1: Content Adaptation and Delivery
  • MMVE (Main room): Session #2: Immersive Experiences
  • Coffee break (all welcome)
  • Keynote (Main room): Qiaolin Chen (Tencent IEG Global)
  • GameSys (Main room): Session #1: Human-Game Interaction
  • End of the day: freshen up
  • Social: Welcome drinks and dinner (all welcome)

Sept. 29th

  • Keynote (Main room): Caitlin Kalinowski (Facebook)
  • Coffee break (all welcome)
  • MMSys (Main room): Session #1: Immersive Media
  • MMSys (Main room): Session #2: Live Video
  • Lunch (all welcome)
  • MMSys (Main room): Session #3: Content Preparation
  • Coffee break (all welcome)
  • Grand Challenge #1 (Main room): Detecting Cheapfakes
  • Grand Challenge #2 (Main room): Bandwidth Estimation for Real-Time Communications
  • Open Dataset and Software (Main room): Session #1: Software, Tools and Datasets
  • Demo and Industry (Main room): Session #1: Conventional and Immersive Encoding, Streaming and Analytics
  • End of the day: freshen up
  • Social: Kebab night and outgoing (all welcome)

Sept. 30th

  • Keynote (Main room): Chris Bregler (Google)
  • Coffee break (all welcome)
  • MMSys (Main room): Session #4: Cloud-Based Multimedia Processing
  • MMSys (Main room): Session #5: Multimedia in Outdoor and Mobile Environments
  • Lunch (all welcome)
  • MMSys (Main room): Session #6: Computer Vision Systems
  • Coffee break (all welcome)
  • Doctoral Symposium (Main room): Session #1: Next-Gen Researchers
  • EDI Workshop (Main room): Equality, Diversity and Inclusion Workshop
  • End of the day: freshen up
  • Social: Ottoman cuisine (all welcome)

Oct. 1st

  • Keynote (Main room): Sriram Sethuraman and Deepthi Nandakumar (Amazon)
  • Coffee break (all welcome)
  • NOSSDAV (Main room): Session #1: Yet Another Streaming Session
  • NOSSDAV (Main room): Session #2: "Fıstık Gibi" Video
  • Lunch (all welcome)
  • NOSSDAV (Main room): Session #3: Deep Video
  • Coffee break (all welcome)
  • NOSSDAV (Main room): Session #4: Deeper Video
  • MMSys (Main room): Closing and Awards
  • End of the day: freshen up
  • Social: Fish night, Bosphorus tour and streamers' party (all welcome)

Social Events

MMSys has always been characterized by social opportunities to promote interaction within the community. This year, we will also have the postponed celebrations for the 30th anniversary of NOSSDAV and the 25th anniversary of Packet Video.

Join us for the following social events:

  • Sept. 28th: Welcome drinks and dinner
  • Sept. 29th: Kebab night and outgoing
  • Sept. 30th: Ottoman cuisine
  • Oct. 1st: Fish night, Bosphorus tour and streamers' party

All participants and their registered companions are welcome to attend these events.