Bloggers often say "Test Everything!" as if it were the solution to every problem. Unfortunately, everything is rather a lot. A standard Google text ad has 90 characters to play with (ignoring the display URL). There are 26 lowercase letters in the alphabet, 26 uppercase letters and 10 numerals, which gives 62 possible characters for each position. So testing everything (ignoring punctuation) means 62^90 different text ads, a number with 162 digits (roughly 2 x 10^161). This is far too many to test, and the hypothesis space (i.e. all the possible hypotheses to test) is quite small here. When you look at website optimisation, or anything else with fewer constraints, the numbers get even bigger.
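As a quick sanity check on that count, here is a minimal Python sketch of the arithmetic. It assumes the figures quoted above (90 usable characters, 62 possible symbols) and that every position is filled; the variable names are just illustrative.

```python
# Assumptions: 90 usable characters per ad (the figure quoted in the text),
# every position filled, and an alphabet of 26 + 26 + 10 = 62 symbols
# (no spaces or punctuation).
ALPHABET_SIZE = 26 + 26 + 10
AD_LENGTH = 90

# Each of the 90 positions can independently hold any of the 62 symbols.
total_ads = ALPHABET_SIZE ** AD_LENGTH

print(f"{total_ads:.3e}")                # scientific notation
print(len(str(total_ads)), "digits")     # size of the exact integer
```

Allowing spaces and punctuation, or ads shorter than 90 characters, only makes the total larger.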
So even those who say "test everything" are doing some sort of filtering on the hypotheses they test.
How is this filtering done? And what is the optimum way of doing it? These are some of the questions that drove Phaedrus mad in the book Zen and the Art of Motorcycle Maintenance. I wish I had an answer to them, but at least the book is suddenly making a bit more sense.