A student at the University of Toronto has created a digital video identification system that may shine some light on the techniques used by YouTube and other Internet companies to protect copyright and place contextual ads in user-uploaded video clips.
The new system – dubbed Tiny Videos – has been developed by computer engineering graduate student Alex Karpenko and is able to reduce the size of videos by removing similar frames and downsizing the resolution to a mere 32×32 pixels. That leaves enough information for the system to identify other video clips containing the same footage.
“In a nutshell, it’s a way to search a large number of videos based on their content,” Karpenko says.
Since YouTube launched in 2005, user-uploaded video content has grown exponentially on the Web. One problem YouTube encountered after being acquired by Google in 2006 is that many users don’t respect copyright. Another challenge was that the site had to rely on user-typed metadata to properly label a video so other users could find what they need.
The copyright violation issue came to a head in March 2007 when Viacom filed a $1 billion lawsuit against YouTube, claiming the site had more than 160,000 copyrighted clips from MTV, Comedy Central, and Paramount Pictures. The case isn’t expected to go to trial until later in 2009 or even 2010.
Meanwhile, Google has shared anonymous user data with Viacom and the company is using it to determine the extent of the copyright infringement.
YouTube’s policies against copyright infringement involve removal of videos upon complaint, and banning users who repeatedly violate copyright laws. On Oct. 15, the site launched YouTube Video Identification beta as a tool to help copyright holders manage their content.
“Video identification goes above and beyond our legal responsibilities,” wrote David King, YouTube product manager with Google, in a blog post. “It will help copyright holders identify their works on YouTube, and choose what they want done with their videos.”
Little is known about the process used to actually identify video footage uploaded to YouTube. King says only that a unique “hash” of each video is created and matched against a database of content provided by copyright holders. If there is a match, the copyright holder is informed and given options to take action.
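The hash-and-match idea King describes can be illustrated with a toy lookup table. This is only a sketch under loud assumptions: YouTube has not published its method, and the names `video_hash`, `ReferenceDatabase`, `register` and `check_upload` are hypothetical. The sketch uses a cryptographic hash (SHA-256), which only matches byte-identical footage; a production system would presumably need a fingerprint that survives re-encoding and cropping.

```python
import hashlib


def video_hash(frames):
    """Hypothetical fingerprint: SHA-256 over the raw bytes of every frame.

    Assumption: each frame is a bytes object. This matches only exact
    copies and is NOT YouTube's actual (unpublished) technique.
    """
    digest = hashlib.sha256()
    for frame in frames:
        digest.update(frame)
    return digest.hexdigest()


class ReferenceDatabase:
    """Maps fingerprints of reference clips to their copyright holders."""

    def __init__(self):
        self._owners = {}

    def register(self, frames, owner):
        # Copyright holders supply reference footage ahead of time.
        self._owners[video_hash(frames)] = owner

    def check_upload(self, frames):
        # Returns the owner to notify, or None when nothing matches.
        return self._owners.get(video_hash(frames))
```

In this sketch, a copyright holder would call `register` once per reference clip, and the site would call `check_upload` on each new upload, informing the returned owner when the result is not `None`.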
“Our Video Identification tools provide choice and control to content owners by combining a sophisticated policy engine with cutting-edge video matching technology,” adds Aaron Zamost, a YouTube spokesperson with Google, in an e-mail interview with ITBusiness.ca.
There were some delays in the unveiling of YouTube’s identification technology, and co-founder Steve Chen called it “one of the most technologically complicated tasks we have ever undertaken.”
But U of T student Karpenko may have developed such a system with just a sampling of YouTube’s massive repertoire of videos. He spent four months collecting 50,000 videos from YouTube to use as a test dataset for his research. Karpenko describes it as the largest digital video dataset available.
To describe how the system works, the graduate student gives an example of a video interview involving a subject and an interviewer.
“Even if it goes on for hours, there’s very little difference in the visual information,” he says. “The clustering algorithm will discard all the similar frames and select only one frame for the shot.”
Watching a video that had been chewed up by Karpenko’s algorithm wouldn’t make much sense to a human. It would be thumbnail-sized, less than a second long, and silent. But he says it’s enough to match it to other videos containing the same information. That could link the video to longer versions of the same clip, or a different network’s version of the same news story.
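The reduction Karpenko describes, shrinking frames to 32×32 pixels and keeping one frame per shot, can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not Karpenko’s code: the function names, the greyscale input, the block-averaging downsample, the difference threshold of 10, and the simple "compare with the last kept frame" rule (a stand-in for the clustering algorithm he mentions) are all assumptions.

```python
import numpy as np


def downsample(frame, size=32):
    """Shrink a greyscale frame to size x size by averaging pixel blocks.

    Assumes the frame is a 2-D array at least `size` pixels on each side.
    """
    h, w = frame.shape
    bh, bw = h // size, w // size
    trimmed = frame[:bh * size, :bw * size]
    return trimmed.reshape(size, bh, size, bw).mean(axis=(1, 3))


def tiny_signature(frames, threshold=10.0):
    """Keep one tiny frame per run of visually similar frames.

    Assumes a non-empty sequence. A frame is kept only when its mean
    absolute pixel difference from the last kept frame exceeds `threshold`.
    """
    kept = []
    for frame in frames:
        small = downsample(np.asarray(frame, dtype=np.float64))
        if not kept or np.abs(small - kept[-1]).mean() > threshold:
            kept.append(small)
    return np.stack(kept)


def shares_footage(sig_a, sig_b, threshold=10.0):
    """Report whether any kept frame in one signature is close to one in the other."""
    diffs = np.abs(sig_a[:, None] - sig_b[None, :]).mean(axis=(2, 3))
    return bool((diffs <= threshold).any())
```

Under these assumptions, an hour-long interview with a static shot would collapse to a handful of 32×32 frames, and `shares_footage` would flag a short excerpt of it as containing the same footage.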
“The tools you can use to find videos [on YouTube] are very basic,” he says. “There are better ways to organize this data and let users know about other related videos.”
The video sharing site doesn’t use its current Video Identification tool to find related videos, according to YouTube sources. That capability is based on user-entered labels, as well as user behaviour (videos watched by users are linked together in a series).
But companies offering the service of tracking online video and matching it with copyright holders are more interested in advertising arrangements.
Mountain View, Calif.-based Anvato Inc. is marketing its Perceptual Signature technology as the closest emulation of human vision available. Currently the service receives about 250,000 requests a day from undisclosed clients looking to protect their copyrights. But the long-term goal is to make widespread video distribution legal.
“Since we can detect and track which videos belong to whom, we can facilitate that business model,” says Alper Turgut, president and CEO of Anvato. “We propose inserting advertising and taking a share of that advertising because it doesn’t block viewership and it adds value.”
The Anvato system creates a compressed fingerprint that can identify similar footage, just like Karpenko’s Tiny Videos system. But the way it creates that fingerprint is very different and, according to Turgut, more complex.
“The system tries to break a video down into its components such as shapes, colours, and the rest of those human-perceived artifacts in a video,” he says. “It accesses the video buffer and acts like a human being watching the video over and over again.”
The technology has been trained to identify certain types of objects by comparing objects in a video to examples it already has. The system can be trained to identify new objects in a video so long as they are larger than 30 pixels in size. It is capable of identifying human faces as well.
Currently, the system is scanning videos uploaded to YouTube, Veoh, Meta Café, Google Video and DailyMotion.
Los Angeles-based Auditude is another company that offers a video-matching service through a patented fingerprinting technology. The company announced Nov. 3 a partnership with social network MySpace and Viacom’s MTV Networks that will allow MTV to advertise in user-uploaded videos.
“We provide our fingerprinting technology married to an ad platform to enable widely distributed media monetization,” says Adam Cahan, president and CEO of Auditude. “We manage official content distribution to ensure ads follow the videos as well as the audience syndicated piece.”
Auditude scours the Web for user-generated content and has examined more than one billion minutes of video so far, Cahan says. The firm also sidesteps the issue of copyright holders providing a reference file by fingerprinting programs broadcast on television, giving the company the largest index of professional content in the world.
As for Karpenko, the Toronto student says he has no interest in starting up his own video Web site. But he is interested in approaching YouTube about his system after he completes his master’s degree.