Imported: 13 Feb '17 | Published: 18 Jan '11
USPTO - Utility Patents
A variable-bit-rate (VBR)-encoding of a plurality of clips from a plurality of video content items is performed to produce a VBR-encoded aggregated video content item. The VBR-encoding of each of the clips is based on an encoding complexity of at least one other one of the clips. This can be performed by combining the clips into aggregated video content and two-pass VBR-encoding the aggregated video content. A video quality test may be performed using the VBR-encoded aggregated video content item.
The present disclosure relates to video quality testing.
Variable-bit-rate (VBR) encoding refers to a video encoder's ability to vary an amount of data used to encode a scene based on an overall complexity of information being encoded. A video scene with a large amount of detail and movement, for example, may be encoded using more data than a scene that is relatively simple and lacks motion.
Pre-encoding refers to encoding an entire video before the video is delivered. Pre-encoding may involve an encoder analyzing the entire video to be encoded to generate a more sophisticated encoding. When encoding a movie, for example, the entire movie can be analyzed to determine which parts of the movie are relatively complex and which parts are relatively simple. The encoder can pre-allocate, from an overall data budget, a specific amount of data for each section of the movie. Such an approach allows the encoder to maintain a limit on the overall size of an output file while optimizing the allocation of data to specific parts of the movie. Video quality can be optimized within the constraint of an overall file size.
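The pre-allocation described above can be pictured with a minimal sketch. It assumes a simple model in which each section receives a share of the overall data budget in proportion to a complexity score; the function name `allocate_budget`, the scores, and the budget are hypothetical, and a real encoder's rate control is considerably more involved.

```python
def allocate_budget(complexities, total_bits):
    """Split total_bits across sections in proportion to complexity scores.

    Proportional allocation is an illustrative assumption, not a
    description of any particular encoder.
    """
    total = sum(complexities)
    if total <= 0:
        raise ValueError("complexities must sum to a positive value")
    return [total_bits * c / total for c in complexities]


sections = [1.0, 3.0, 1.0, 5.0]  # hypothetical per-section complexity scores
budget = 8_000_000               # hypothetical overall budget in bits
allocation = allocate_budget(sections, budget)

for score, bits in zip(sections, allocation):
    print(f"complexity {score}: {bits:.0f} bits")
```

The allocations always sum to the overall budget, which mirrors the constraint on output file size, while more complex sections receive more data.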
The combination of VBR and pre-encoding enables higher-quality video (when compared to other encoding mechanisms) to be delivered using the same amount of data. Pre-encoded VBR content is a common type of encoded content for many download-to-play video services found on the Internet. Encoding schemes and/or video distribution and display systems are evaluated using video quality and acceptability test procedures. These procedures often involve multiple pieces of video content to provide a diverse set over which to test.
The multiple pieces of video content used in a video quality test may comprise multiple full-length movies, for example. For purposes of illustration and example, consider a video quality test that involves thirty full-length movies having an average length of roughly ninety minutes per movie. To test multiple different encoding schemes, a significant amount of time is required to encode the multiple full-length movies in each of the encoding schemes.
To reduce the time and cost of encoding in a video quality test, a single clip from each movie may be encoded rather than the entire movie. For example, a three-minute clip may be encoded rather than an entire ninety-minute movie, reducing the amount of encoding by a factor of about 30. However, for testing pre-encoded VBR schemes, a first amount of data from an encoding of the three-minute clip by itself is not necessarily the same as a second amount of data allocated to the three-minute clip from an encoding of the entire movie. An analysis of test data has shown that the first and second amounts of data can differ substantially in practice.
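The mismatch between the two amounts of data can be illustrated with a toy calculation. The sketch below assumes that data is shared in proportion to duration multiplied by a complexity score; the bit rate, the scores, and the helper `bits_for_segment` are hypothetical and are not taken from the test data mentioned above.

```python
def bits_for_segment(segments, index, avg_bitrate):
    """Bits given to segments[index] when the whole list is encoded together.

    Each segment is (duration_seconds, complexity). The proportional
    model is an assumption made for illustration only.
    """
    total_seconds = sum(d for d, _ in segments)
    total_bits = avg_bitrate * total_seconds
    weights = [d * c for d, c in segments]
    return total_bits * weights[index] / sum(weights)


AVG_BITRATE = 2_000_000  # bits per second, hypothetical
clip = (180, 2.0)        # three-minute clip, twice as complex as the rest
rest = (5220, 1.0)       # remaining 87 minutes of a ninety-minute movie

alone = bits_for_segment([clip], 0, AVG_BITRATE)            # first amount
in_movie = bits_for_segment([rest, clip], 1, AVG_BITRATE)   # second amount

print(f"clip encoded by itself:   {alone:.0f} bits")
print(f"clip within entire movie: {in_movie:.0f} bits")
```

Under these assumptions the clip encoded by itself is held to the average bit rate, whereas the same clip within the movie draws extra data from the simpler material around it.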
Embodiments of the present disclosure address this problem without having to resort to encoding all thirty movies in their entirety. The multiple clips from the different movies are combined into a single, aggregated piece of video content. VBR pre-encoding, such as two-pass VBR encoding, is performed on the aggregated piece of video content to produce a VBR pre-encoded aggregated video content item. By encoding the aggregated piece of video content in this manner, each clip is encoded within a larger context of the other clips. As a result, more complex clips are encoded using more data while simpler clips are encoded using less data. The total amount of data for encoding the aggregated clips can be the same as for individually encoding the clips, but the distribution of data can vary from clip to clip, much as it would if the clips were taken from fully encoded movies. Thus, the VBR pre-encoded aggregated video content item has encoded clips that more closely match clips from full-length encodings without requiring the full-length encodings to be generated (e.g., encoding all thirty full-length movies). As a result, a meaningful video quality test can be performed for an encoding of 90 minutes (30 movies×3 minutes/movie) of data rather than the 30 full-length movies of data that normally would have been required.
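As a rough sketch of this property, the following compares clips encoded individually with the same clips encoded as one aggregate. It assumes equal three-minute clips, a hypothetical average bit rate, made-up complexity scores, and a proportional sharing model; none of these values come from the disclosure other than the clip and movie durations.

```python
CLIP_SECONDS = 180       # three-minute clips
AVG_BITRATE = 2_000_000  # bits per second, hypothetical
complexities = [0.5, 1.0, 2.5, 1.0, 0.5, 0.5]  # hypothetical per-clip scores

# Individually encoded: every clip is held to the same average bit rate.
individual = [AVG_BITRATE * CLIP_SECONDS for _ in complexities]

# Aggregated: the same total is shared according to relative complexity
# (proportional sharing is an illustrative assumption).
total_bits = sum(individual)
aggregated = [total_bits * c / sum(complexities) for c in complexities]

# Encoding workload from the example: thirty 3-minute clips versus
# thirty 90-minute movies.
clip_minutes = 30 * 3
movie_minutes = 30 * 90

print(individual)
print(aggregated)
print(clip_minutes, movie_minutes)
```

The totals match, but the most complex clip receives more data in the aggregate and the simplest clips receive less, while the encoding workload stays at 90 minutes rather than 2,700 minutes.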
Embodiments of the present disclosure are described with reference to FIG. 1, which is a flow chart of an embodiment of a method of performing a video quality test, and FIG. 2, which is a block diagram of an embodiment of a system for performing a video quality test.
As indicated by block 10, the method comprises VBR pre-encoding an aggregation of a plurality of clips from a plurality of video content items. This act produces a VBR pre-encoded aggregated video content item. Because the aggregation is being VBR pre-encoded, each of the clips is encoded based on an encoding complexity of at least one other one of the clips. In some embodiments, each of the clips is VBR-encoded based on an encoding complexity of the clip relative to encoding complexities of some or all others of the clips. In one embodiment, the act of VBR pre-encoding comprises performing a two-pass VBR-encoding of a single, aggregated piece of video content that concatenates or otherwise combines the plurality of clips. The two-pass VBR-encoding includes a first pass that analyzes the entire single, aggregated piece of video content, and a second pass that encodes the single, aggregated piece of video content based on the analysis.
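One way a two-pass VBR encoding might be invoked in practice is sketched below. The disclosure does not name an encoder; the sketch assumes the FFmpeg command-line tool with the libx264 encoder, and the file names, target bit rate, and helper `two_pass_commands` are hypothetical. The code only builds the command lines and does not run them.

```python
import os


def two_pass_commands(src, dst, video_bitrate="2M"):
    """Build FFmpeg argument lists for a two-pass encode (assumed tooling).

    The first pass analyzes the entire input and discards the output;
    the second pass encodes using the statistics from the first.
    """
    common = ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-b:v", video_bitrate]
    first = common + ["-pass", "1", "-an", "-f", "null", os.devnull]
    second = common + ["-pass", "2", dst]
    return first, second


first_pass, second_pass = two_pass_commands("aggregated.mp4", "aggregated_vbr.mp4")
print(" ".join(first_pass))
print(" ".join(second_pass))
```

Each list could be passed to `subprocess.run` on a machine where FFmpeg is installed; because the input is the single aggregated piece of video content, the first-pass analysis spans all of the clips.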
As described above, the plurality of video content items may comprise a plurality of movies. The movies may comprise cinematic movies, made-for-television movies or other movies produced for a mass audience, for example. The movies may be individually-released and/or individually-purchasable movies. In these cases, each movie normally can be purchased, rented, downloaded or viewed independently of the other movies.
In some embodiments, each clip is of a duration substantially less than a duration of its source video content item. As described above, the duration of a clip may be about three minutes relative to a ninety-minute source video content item.
The duration of each clip is selected to be long enough so that human subjects in a video quality test can become psychologically involved in viewing the clip. However, the duration of each clip is selected to be short enough to avoid prolonging the amount of time each human subject is watching clips and the amount of time needed to encode the aggregated clips. In some embodiments, the duration of each clip is within a range of about one minute to about two minutes. In other embodiments, the duration of each clip is within a range of about one minute to about three minutes.
In some embodiments, some, most or all of the clips have about the same duration. As described above, some, most or all of the clips may have a duration of about three minutes. In alternative embodiments, the clips may have different durations.
The number of clips that are concatenated to form the single, aggregated piece of video content is selected to provide a sufficiently-large sample size to perform a statistical analysis of the human subjects' evaluations of overall quality. However, the number of clips is selected to be small enough to avoid prolonging the amount of time each human subject is watching clips and the amount of time needed to encode the aggregated clips. In some embodiments, the number of clips is within a range of about ten clips to about fifty clips, with thirty clips being used in one embodiment.
For purposes of illustration and example, consider four clips 12, 14, 16 and 18 from four video content items 22, 24, 26 and 28, respectively, as shown in FIG. 2. A combiner 30 concatenates or otherwise combines the four clips 12, 14, 16 and 18 into a single, aggregated piece of video content 32. A VBR encoder 33 VBR-encodes the aggregated piece of video content 32 to produce a VBR pre-encoded aggregated video content item 34. The VBR encoder 33 may perform a two-pass VBR encoding of the aggregated piece of video content 32 to produce the VBR pre-encoded aggregated video content item 34.
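The role of the combiner can be sketched as follows, assuming concatenation is done with FFmpeg's concat demuxer, which reads a list file with one `file '...'` line per clip. The clip file names, the three-minute durations, and both helper functions are hypothetical, and file names are assumed to contain no single quotes.

```python
def concat_list(paths):
    """Return the text of an FFmpeg concat-demuxer list file, one clip per line."""
    return "".join(f"file '{p}'\n" for p in paths)


def clip_offsets(durations):
    """Start time in seconds of each clip within the aggregated content."""
    offsets, start = [], 0.0
    for d in durations:
        offsets.append(start)
        start += d
    return offsets


clips = ["clip12.mp4", "clip14.mp4", "clip16.mp4", "clip18.mp4"]
listing = concat_list(clips)
offsets = clip_offsets([180.0, 180.0, 180.0, 180.0])

print(listing, end="")
print(offsets)
```

Recording the start offset of each clip makes it possible to map a position in the aggregated content back to the clip, and hence the source video content item, that it came from.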
After the encoding has completed, the method comprises performing a video quality test using the VBR-encoded aggregated video content item 34, as indicated by block 36. The video quality test is performed by a video quality test apparatus 40 that decodes and plays back the VBR-encoded aggregated video content item 34. The video quality test may be performed using human subjects, who view a display of the decoded content and provide one or more subjective ratings of the video quality. Alternatively, the video quality test may be automated using a computer that rates characteristics of the decoded content.
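Subjective ratings gathered in such a test could be summarized along the following lines. The sketch assumes a five-point rating scale and averages the ratings for each clip; the scale, the clip labels, and the ratings themselves are hypothetical and do not come from the disclosure.

```python
from statistics import mean

# Hypothetical ratings on an assumed 1-to-5 scale, keyed by clip.
ratings = {
    "clip12": [4, 5, 4],
    "clip14": [3, 3, 4],
}

# Mean rating per clip, then the mean across clips.
mean_scores = {clip: mean(values) for clip, values in ratings.items()}
overall = mean(mean_scores.values())

for clip, score in mean_scores.items():
    print(f"{clip}: {score:.2f}")
print(f"overall: {overall:.2f}")
```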
The acts indicated by blocks 10 and 36 may be repeated for multiple different VBR encoding schemes, parameters and modes. The resulting multiple video quality tests can be analyzed to determine desirable VBR encoding scheme(s), parameter(s) and mode(s) based on one or more video quality objectives and one or more constraints (e.g. bandwidth or bit rate constraints).
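The comparison across encoding schemes can be sketched as a simple selection under a constraint. The example assumes each test yields a single quality score and an average bit rate per scheme; the scheme names, numbers, and the helper `best_scheme` are hypothetical.

```python
def best_scheme(results, max_bitrate):
    """Return the highest-scoring result whose bit rate meets the constraint.

    Returns None when no scheme satisfies the constraint.
    """
    eligible = [r for r in results if r["bitrate"] <= max_bitrate]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r["score"])


# Hypothetical outcomes of repeating the encode-and-test acts per scheme.
results = [
    {"name": "A", "bitrate": 1_500_000, "score": 3.6},
    {"name": "B", "bitrate": 2_000_000, "score": 4.1},
    {"name": "C", "bitrate": 3_000_000, "score": 4.5},
]

best = best_scheme(results, max_bitrate=2_000_000)
print(best["name"] if best else "no scheme meets the constraint")
```

Here the quality objective is a single score and the constraint is a bit rate ceiling; a real analysis may weigh several objectives and constraints at once.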
In addition to quality testing, the herein-disclosed encoding method and system can be used to enhance the video quality of commercials embedded in longer content items such as television programs or movies. Separately encoding the longer content and the commercials can result in quality discontinuities from content-to-commercials and/or from commercials-to-content. For example, if a relatively complex-to-encode commercial is embedded in a relatively simple-to-encode program, a total amount of data available for a relatively short commercial may be insufficient to encode the commercial with a desirable level of quality. Encoding the aggregated program and commercial would cause the program to be encoded with less data than if encoded individually. The resulting unused data can be applied to the commercial. The resulting quality of the commercial is higher with only a small degradation to the rest of the program. Further, the viewer experiences a more consistent video quality.
The herein-described components may be embodied by one or more computer processors directed by computer-readable program code stored by a computer-readable medium.
Referring to FIG. 3, an illustrative embodiment of a general computer system is shown and is designated 300. The computer system 300 can include a set of instructions that can be executed to cause the computer system 300 to perform any one or more of the methods or computer based functions disclosed herein. The computer system 300 may operate as a standalone device or may be connected, e.g., using a network, to other computer systems or peripheral devices.
In a networked deployment, the computer system may operate in the capacity of a server or as a client user computer in a server-client user network environment, or as a peer computer system in a peer-to-peer (or distributed) network environment. The computer system 300 can also be implemented as or incorporated into various devices, such as a personal computer (PC), a tablet PC, a set-top box (STB), a personal digital assistant (PDA), a mobile device, a palmtop computer, a laptop computer, a desktop computer, a communications device, a wireless telephone, a land-line telephone, a control system, a camera, a scanner, a facsimile machine, a printer, a pager, a personal trusted device, a web appliance, a network router, switch or bridge, or any other machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. In a particular embodiment, the computer system 300 can be implemented using electronic devices that provide voice, video or data communication. Further, while a single computer system 300 is illustrated, the term “system” shall also be taken to include any collection of systems or sub-systems that individually or jointly execute a set, or multiple sets, of instructions to perform one or more computer functions.
As illustrated in FIG. 3, the computer system 300 may include a processor 302, e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both. Moreover, the computer system 300 can include a main memory 304 and a static memory 306, that can communicate with each other via a bus 308. As shown, the computer system 300 may further include a video display unit 310, such as a liquid crystal display (LCD), an organic light emitting diode (OLED), a flat panel display, a solid state display, or a cathode ray tube (CRT). Additionally, the computer system 300 may include an input device 312, such as a keyboard, and a cursor control device 314, such as a mouse. The computer system 300 can also include a disk drive unit 316, a signal generation device 318, such as a speaker or remote control, and a network interface device 320.
In a particular embodiment, as depicted in FIG. 3, the disk drive unit 316 may include a computer-readable medium 322 in which one or more sets of instructions 324, e.g. software, can be embedded. Further, the instructions 324 may embody one or more of the methods or logic as described herein. In a particular embodiment, the instructions 324 may reside completely, or at least partially, within the main memory 304, the static memory 306, and/or within the processor 302 during execution by the computer system 300. The main memory 304 and the processor 302 also may include computer-readable media.
In an alternative embodiment, dedicated hardware implementations, such as application specific integrated circuits, programmable logic arrays and other hardware devices, can be constructed to implement one or more of the methods described herein. Applications that may include the apparatus and systems of various embodiments can broadly include a variety of electronic and computer systems. One or more embodiments described herein may implement functions using two or more specific interconnected hardware modules or devices with related control and data signals that can be communicated between and through the modules, or as portions of an application-specific integrated circuit. Accordingly, the present system encompasses software, firmware, and hardware implementations.
In accordance with various embodiments of the present disclosure, the methods described herein may be implemented by software programs executable by a computer system. Further, in an exemplary, non-limiting embodiment, implementations can include distributed processing, component/object distributed processing, and parallel processing. Alternatively, virtual computer system processing can be constructed to implement one or more of the methods or functionality as described herein.
The present disclosure contemplates a computer-readable medium that includes instructions 324 or receives and executes instructions 324 responsive to a propagated signal, so that a device connected to a network 326 can communicate voice, video or data over the network 326. Further, the instructions 324 may be transmitted or received over the network 326 via the network interface device 320.
While the computer-readable medium is shown to be a single medium, the term “computer-readable medium” includes a single medium or multiple media, such as a centralized or distributed database, and/or associated caches and servers that store one or more sets of instructions. The term “computer-readable medium” shall also include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor or that cause a computer system to perform any one or more of the methods or operations disclosed herein.
In a particular non-limiting, exemplary embodiment, the computer-readable medium can include a solid-state memory such as a memory card or other package that houses one or more non-volatile read-only memories. Further, the computer-readable medium can be a random access memory or other volatile re-writable memory. Additionally, the computer-readable medium can include a magneto-optical or optical medium, such as a disk or tapes or other storage device to capture carrier wave signals such as a signal communicated over a transmission medium. A digital file attachment to an e-mail or other self-contained information archive or set of archives may be considered a distribution medium that is equivalent to a tangible storage medium. Accordingly, the disclosure is considered to include any one or more of a computer-readable medium or a distribution medium and other equivalents and successor media, in which data or instructions may be stored.
Although the present specification describes components and functions that may be implemented in particular embodiments with reference to particular standards and protocols, the invention is not limited to such standards and protocols. For example, standards for Internet and other packet switched network transmission (e.g., TCP/IP, UDP/IP, HTML, HTTP) represent examples of the state of the art. Such standards are periodically superseded by faster or more efficient equivalents having essentially the same functions. Accordingly, replacement standards and protocols having the same or similar functions as those disclosed herein are considered equivalents thereof.
The illustrations of the embodiments described herein are intended to provide a general understanding of the structure of the various embodiments. The illustrations are not intended to serve as a complete description of all of the elements and features of apparatus and systems that utilize the structures or methods described herein. Many other embodiments may be apparent to those of skill in the art upon reviewing the disclosure. Other embodiments may be utilized and derived from the disclosure, such that structural and logical substitutions and changes may be made without departing from the scope of the disclosure. Additionally, the illustrations are merely representational and may not be drawn to scale. Certain proportions within the illustrations may be exaggerated, while other proportions may be minimized. Accordingly, the disclosure and the figures are to be regarded as illustrative rather than restrictive.
One or more embodiments of the disclosure may be referred to herein, individually and/or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any particular invention or inventive concept. Moreover, although specific embodiments have been illustrated and described herein, it should be appreciated that any subsequent arrangement designed to achieve the same or similar purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all subsequent adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the description.
The Abstract of the Disclosure is provided to comply with 37 C.F.R. §1.72(b) and is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, various features may be grouped together or described in a single embodiment for the purpose of streamlining the disclosure. This disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter may be directed to less than all of the features of any of the disclosed embodiments. Thus, the following claims are incorporated into the Detailed Description, with each claim standing on its own as defining separately claimed subject matter.
The above disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments which fall within the true spirit and scope of the present invention. Thus, to the maximum extent allowed by law, the scope of the present invention is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.