Pretesting of Finished Ads
Pretesting finished ads is one of the more commonly employed studies among marketing researchers and their agencies. At this stage, a finished advertisement or commercial is used; since it has not been presented to the market, changes can still be made.
Many researchers believe testing the ad in final form provides better information. Several test procedures are available for print and broadcast ads, including both laboratory and field methodologies.
Print methods include portfolio tests, analyses of readability, and dummy advertising vehicles. Broadcast tests include theater tests and on-air tests. Both print and broadcast may use physiological measures.
Pretesting Finished Print Messages
A number of methods for pretesting finished print ads are available. One is Diagnostic Research Inc.'s Copytest System, described in Figure 19-9. The most common of these methods are portfolio tests, readability tests, and dummy advertising vehicles.
Portfolio Tests
Portfolio tests are a laboratory methodology designed to expose a group of respondents to a portfolio consisting of both control and test ads. Respondents are then asked what information they recall from the ads. The assumption is that the ads that yield the highest recall are the most effective.
While portfolio tests offer the opportunity to compare alternative ads directly, a number of weaknesses limit their applicability:
1. Factors other than advertising creativity and/or presentation may affect recall. Interest in the product or product category, the fact that respondents know they are participating in a test, or interviewer instructions (among others) may account for more differences than the ad itself.
2. Recall may not be the best test. Some researchers argue that for certain types of products (those of low involvement) ability to recognize the ad when shown may be a better measure than recall.
One way to determine the validity of the portfolio method is to correlate its results with readership scores once the ad is placed in the field. Whether such validity tests are actually being conducted is not readily known, although the portfolio method remains popular in the industry.
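The validity check described above can be sketched as a simple correlation between portfolio recall and later field readership. This is only an illustration: the function name `pearson_r` and all of the scores below are hypothetical, and the text does not say which correlation statistic a research firm would actually use.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: share of respondents recalling each of five ads in a
# portfolio test, and the readership score each ad later earned in the field.
portfolio_recall = [0.42, 0.31, 0.55, 0.27, 0.48]
field_readership = [0.38, 0.25, 0.51, 0.30, 0.40]

r = pearson_r(portfolio_recall, field_readership)
print(f"Correlation between portfolio recall and field readership: {r:.2f}")
```

A coefficient near 1 would suggest that the lab ranking of ads carries over to the field; a value near 0 would cast doubt on the portfolio method for that set of ads.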
Figure 19-9 Diagnostic Research Inc.'s print test
Objective: Tests recall and readers' impressions of print ads.
Method: Mall intercepts in two or more cities are used to screen respondents and have them take home "test magazines" for reading. Participants are phoned the next day to determine opinions of the ads, recall of ad contents, and other questions of interest to the sponsor. Approximately 225 people constitute the sample.
Output: Scores reported include related recall of copy and visual elements, sales messages, and other nonspecific elements. Both quantitative (table) scores and verbatim responses are reported.
Readability Tests
The communications efficiency of the copy in a print ad can be tested without reader interviews. This test uses the Flesch formula, named after its developer, Rudolf Flesch, to assess readability of the copy by determining the average number of syllables per 100 words. Human interest appeal of the material, length of sentences, and familiarity with certain words are also considered and correlated with the educational background of target audiences. Test results are compared to previously established norms for various target audiences. The test suggests that copy is best comprehended when sentences are short, words are concrete and familiar, and personal references are drawn.
This method eliminates many of the interviewee biases associated with other tests and avoids gross errors in understanding. The norms offer an attractive standard for comparison.
Disadvantages are also inherent, however. The copy may become too mechanical, and direct input from the receiver is not available. Without this input, contributing elements like creativity cannot be addressed. To be effective, this test should be used only in conjunction with other pretesting methods.
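For readers who want to see how a readability score of this kind is computed, the following is a minimal sketch of the widely published Flesch Reading Ease formula, which combines average sentence length with average syllables per word. The syllable counter is a rough vowel-group heuristic assumed here for illustration (it is not Flesch's own counting rule), and the function names are hypothetical. Human interest appeal and audience norms, which the test also considers, are not modeled.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as groups of consecutive vowels (heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("make"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier copy."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical ad copy: short sentences and familiar words score higher.
simple_copy = "Buy now. Save big."
dense_copy = ("Comprehensive institutional considerations necessitate "
              "extraordinarily sophisticated communication.")
print(flesch_reading_ease(simple_copy) > flesch_reading_ease(dense_copy))
```

In practice the resulting score would be compared with norms for the target audience rather than read in isolation.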
Dummy Advertising Vehicles
In an improvement on the portfolio test, ads are placed in "dummy" magazines developed by an agency or research firm. The magazines contain regular editorial features of interest to the reader, as well as the test ads, and are distributed to a random sample of homes in predetermined geographic areas. Readers are told the magazine publisher is interested in evaluations of editorial content and asked to read the magazines as they normally would. Then they are interviewed on their reactions to both editorial content and ads. Recall, readership, and interest-generating capabilities of the ad are assessed.
The advantage of this method is that it provides a more natural setting than the portfolio test. Readership occurs in the participant's own home, the test more closely approximates a natural reading situation, and the reader may go back to the magazine, as people typically do.
But the dummy magazine shares the other disadvantages associated with portfolio tests. The testing effect is not eliminated, and product interest may still bias the results. Thus, while this test offers some advantages over the portfolio method, it is not a guaranteed measure of the advertising's impact.
While all the previously described measures are available, the most popular form of pretesting of print ads now involves a series of measures. Companies like Gallup & Robinson and Ipsos-ASI offer copy testing services that address many of the shortcomings cited above. The tests can be used for rough and/or finished ads and are most commonly conducted in the respondents' homes. For example, Gallup & Robinson's Magazine Impact Research Service (MIRS) uses an at-home, in-magazine context, employing widely dispersed samples, and offers standardized measures as well as a variety of options. Ipsos-ASI's Next*Print methodology also offers multiple measures, as shown in Figure 19-10.
Pretesting Finished Broadcast Ads
A variety of methods for pretesting broadcast ads are available. The most popular are theater tests, on-air tests, and physiological measures.
Figure 19-10 Ipsos-ASI's Next*Print
Objective: To assist advertisers in copy testing of print advertisements to determine (1) main idea communication, (2) likes and dislikes, (3) believability, (4) ad attribute ratings, (5) overall likeability, and (6) brand attribute ratings.
Method: Tests are conducted in current issues of newsstand magazines such as People, Better Homes & Gardens, and Newsweek. The recall measure consists of 150 responses. Diagnostic measures range from 105 to 150 responses. Highly targeted audiences are available through a version known as the Targeted Print Test.
Output: Standard scores and specific diagnostics.
Theater Tests
In the past, one of the most popular laboratory methods for pretesting finished commercials was theater testing. In theater tests participants are invited by telephone, mall intercepts, and/or tickets in the mail to view pilots of proposed TV programs. In some instances, the show is actually being tested, but more commonly a standard program is used so audience responses can be compared with normative responses established by previous viewers. Sample sizes range from 250 to 600 participants.
On entering the theater, viewers are told a drawing will be held for gifts and are asked to complete a product preference questionnaire asking which products they would prefer if they win. This form also requests demographic data. Participants may be seated in specific locations in the theater to allow observation by age, sex, and so on. They view the program and commercials, and a form asking for evaluations is distributed. Participants are then asked to complete a second form for a drawing so that changes in product preference can be noted. In addition to product/brand preference, the form may request other information:
1. Interest in and reaction to the commercial.
2. Overall reaction to the commercial as measured by an adjective checklist.
3. Recall of various aspects of the commercial.
4. Interest in the brand under consideration.
5. Continuous (frame-by-frame) reactions throughout the commercial.
The methods of theater testing operations vary, though all measure brand preference changes. For example, many of the services now use videotaped programs with the commercials embedded for viewing in one's office rather than in a theater. Others establish viewing rooms in malls and/or hotel conference rooms. Some do not take all the measures listed here; others ask the consumers to turn dials or push buttons on a keypad to provide the continual responses. An example of one methodology is shown in Figure 19-11.
Those opposed to theater tests cite a number of disadvantages. First, they say the environment is too artificial. The lab setting is bad enough, but asking respondents to turn dials or, as one service does, wiring people for physiological responses takes them too far from a natural viewing situation. Second, the contrived measure of brand preference change seems too phony to believe. Critics contend that participants will see through it and make changes just because they think they are supposed to. Finally, the group effect of having others present and overtly exhibiting their reactions may influence viewers who did not have any reactions themselves.
Proponents argue that theater tests offer distinct advantages. In addition to control, the established norms (averages of commercials' performances) indicate how one's commercial will fare against others in the same product class that were already tested. Further, advocates say the brand preference measure is supported by actual sales results.
Despite the limitations of theater testing, most major consumer-product companies have used it to evaluate their commercials. This method may have shortcomings, but it allows them to identify strong or weak commercials and to compare them to other ads.
Figure 19-11 The AD*VANTAGE/ACT theater methodology
Advertising Control for Television (ACT), a lab procedure of The MSW Group, uses about 400 respondents representing four cities. It measures initial brand preference by asking participants which brands they most recently purchased. Respondents are then divided into groups of 25 to view a 30-minute program with seven commercials inserted in the middle. Four are test commercials; the other three are control commercials with established viewing norms. After viewing the program, respondents are given a recall test of the commercials. After the recall test, a second 30-minute program is shown, with each test commercial shown again. The second measure of brand preference is taken at this time, with persuasion measured by the percentage of viewers who switched preferences from their most recently purchased brand to one shown in the test commercials.
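The persuasion measure described for ACT in Figure 19-11 can be illustrated with a short calculation. This is a sketch under stated assumptions, not the MSW Group's actual scoring procedure: the `Respondent` record, the `persuasion_score` function, the brand labels, and the choice of all viewers as the percentage base are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Respondent:
    recent_purchase: str   # brand most recently purchased (initial preference)
    post_preference: str   # brand preferred after the second exposure


def persuasion_score(respondents, test_brand: str) -> float:
    """Percentage of all viewers who switched from another brand to test_brand."""
    if not respondents:
        raise ValueError("at least one respondent is required")
    switched = sum(
        1
        for r in respondents
        if r.recent_purchase != test_brand and r.post_preference == test_brand
    )
    return 100.0 * switched / len(respondents)


# Hypothetical viewing group: two of four viewers move to brand "B".
group = [
    Respondent("A", "B"),  # switched to the test brand
    Respondent("A", "A"),  # kept the original brand
    Respondent("B", "B"),  # already bought the test brand, so not a switch
    Respondent("C", "B"),  # switched to the test brand
]
print(persuasion_score(group, test_brand="B"))  # 50.0
```

Viewers who already bought the test brand are not counted as switchers, since the measure is defined in terms of a change from the most recently purchased brand.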
On-Air Tests
Some of the firms conducting theater tests also insert the commercials into actual TV programs in certain test markets. This practice is referred to as an on-air test and often includes single-source ad research (discussed later in this chapter). Typically, the commercials are in finished form, although the testing of ads earlier in the developmental process is becoming more common. Information Resources, Ipsos-ASI, MSW Group, and Nielsen are well-known providers of on-air tests.
On-air testing techniques offer all the advantages of field methodologies, as well as all the disadvantages. Further, there are negative aspects to the specific measures taken through the on-air systems. One concern is associated with day-after recall scores, the primary measure used in these tests. Lyman Ostlund notes that measurement errors may result from the natural environment—the position of the ad in the series of commercials shown, the adjacent program content, and/or the number of commercials shown.18 While the testing services believe their methods overcome many of these criticisms, each still uses recall as one of the primary measures of effectiveness. Since recall tests best reflect the degree of attention and interest in an ad, claims that the tests predict the ad's impact on sales may be going too far. (In 28 studies reviewed by Jack Haskins, only 2 demonstrated that factual recall could be related to sales.)19 Joel Dubow's research indicates that recall is a necessary but not sufficient measure, while research by Jones and Blair was even more demonstrative, noting that "it is unwise to look to recall for an accurate assessment of a commercial's sales effect."20
On the plus side, most of the testing services have offered evidence of both validity and reliability for on-air pretesting of commercials. Both Ipsos-ASI and MSW Group claim their pretest and posttest results yield the same recall scores 9 out of 10 times—a strong indication of reliability and a good predictor of the effect the ad is likely to have when shown to the population as a whole.
In summary, on-air pretesting of finished or rough commercials offers some distinct advantages over lab methods and some indications of the ad's likely success. Whether the measures used are as strong an indication as the providers say still remains in question.
Physiological Measures
A less common method of pretesting finished commercials involves a laboratory setting in which physiological responses are measured. These measures indicate the receiver's involuntary response to the ad, theoretically eliminating biases associated with the voluntary measures reviewed to this point. (Involuntary responses are those over which the individual has no control, such as heartbeat and reflexes.) Physiological measures used to test both print and broadcast ads include pupil dilation, galvanic skin response, eye tracking, and brain waves:
1. Pupil dilation. Research in pupillometrics is designed to measure dilation and constriction of the pupils of the eyes in response to stimuli. Dilation is associated with action; constriction involves the body's conservation of energy.
Advertisers have used pupillometrics to evaluate product and package design as well as to test ads. Pupil dilation suggests a stronger interest in (or preference for) an ad or implies arousal or attention-getting capabilities. Other attempts to determine the affective (liking or disliking) responses created by ads have met with less success.
Because of high costs and some methodological problems, the use of pupillometrics has waned over the past decade. But it can be useful in evaluating certain aspects of advertising.
2. Galvanic skin response. Also known as electrodermal response, GSR measures the skin's resistance or conductance to a small amount of current passed between two electrodes. Response to a stimulus activates sweat glands, which in turn increases the conductance of the electrical current. Thus, GSR/EDR activity might reflect a reaction to advertising. In their review of the research in this area, Paul Watson and Robert Gatchel concluded that GSR/EDR (1) is sensitive to affective stimuli, (2) may present a picture of attention, (3) may be useful to measure long-term advertising recall, and (4) is useful in measuring ad effectiveness.21 In interviews with practitioners and reviews of case studies, Priscilla LaBarbera and Joel Tucciarone also concluded that GSR is an effective measure and is useful for measuring affect, or liking, for ads.22 While a number of companies have offered skin response measures, this research methodology is not commonly used now, and LaBarbera and Tucciarone believe that it is underused, given its potential.