
You are the new crossword puzzle intern for a major math software newspaper, The New York Multiply ("All the News That's Bits and Ints"). Your editor has given you a list of words to use in this week's puzzle. He knows you can't possibly use all the words on his list, so he gave each word a weight and told you to "do the best you can". You happen to know that he gave the same challenge to another intern, so you'd better do a good job if you want to keep your position. Maybe some software would be helpful here...
Note that these rules talk about words for the sake of explanation, but in practice the "words" will be vectors of integers.
Restated, the challenge goes like this: Given a list of acceptable words (a "dictionary") and a grid size, return a grid that maximizes the use of high-value words while minimizing the use of bogus words (those that do not appear in the dictionary).
The scoring software looks at your grid and generates a list of all the words that appear in it by scanning the rows from left to right and the columns from top to bottom. Each word that comes off the puzzle grid is either removed from the dictionary (if it is an acceptable word) or added to the bogus word list (if it is not acceptable). In this way, if a word appears only once in the dictionary, then any subsequent time it appears in the grid it is treated as an illegal word. For example, suppose your dictionary contains the words HELP, SINE, MAP, MATH, and PLOT with the weights as shown on the left below. You might construct the grid shown on the right.

The scoring algorithm is going to read three words going across the rows: MAP, HELP, and MATH. Going down we read not only the approved word PLOT, but also two illegal words: MH and AE. The word SINE, while acceptable, didn't make it onto the board.

You pay for all the unused words left in your dictionary, so a lower score is better. Once a word is used from the list, it is removed from the list of acceptable words. Since SINE is unused and has a weight of 5, that becomes your cost. But we're not done yet. You still have to pay a penalty for all your illegal words. The weight associated with these illegal words changes with each puzzle. Let's say that in this case the penalty is 3. Once this penalty is taken into account, your total cost is now the sum of your dictionary, 5, plus the sum of your bogus words, 2*3 = 6, for a total of 11.
It's easy to see now that adding the word MAP on the top row was a bad idea. It cost more (6 penalty points) than it was worth (4 dictionary points). On the other hand, if the penalty per word had been 1 instead of 3, then MAP would have been a smart play. Better still would have been this grid, which achieves a total cost of 4, since MATH and PLOT are the only two words left in the dictionary, and no illegal words are manufactured.

For complete clarity, in actual practice the above problem would look more like this.
>> words
[1x4 double] [1x4 double] [1x3 double] [1x4 double] [1x4 double]
>> words{1}
72 69 76 80
>> words{2}
83 73 78 69
>> words{3}
77 65 80
>> words{4}
77 65 84 72
>> words{5}
80 76 79 84
>> penalty
3
>> weights
10 5 4 2 2
And the grid you return might look like this.
77 65 80 0
72 69 76 80
0 0 79 0
77 65 84 72
You will write the function solver.m which will be called by the test harness. It must have this signature:
board = solver(words, weights, n, penalty)
For each puzzle, the following score will be calculated:
result = sum(Unused Word Score) + penalty * (Number of illegal words)
Since each of these is to be minimized, the lowest overall score at the end of the contest wins.
Cyclomatic complexity, also known as McCabe complexity, is a measure of the number of independent paths through a program's source code. Typically, as this number gets higher, it becomes more difficult to understand what's happening in a program. This makes it harder to test, modify, and refactor.
Since a file can contain multiple functions, the complexity for any given file is defined as the MAXIMUM complexity of any functions contained in it. A good practice is to keep the complexity for each function below 10, so for this contest your overall score will increase according to the complexity in excess of 10. So there is no complexity penalty for submissions in which all functions have a complexity of 10 or less.
You can measure the cyclomatic (or McCabe) complexity of any function in MATLAB using the "cyc" switch for mlint. Try this, for example:
>> mlint -cyc magic.m
This is a rough measure of how long your code is, but it will not penalize you for comments and variable name length.
t = mtree(filename,'-file'); length(t.nodesize)
Your entry will time out and be disqualified if it takes more than 180 seconds (three minutes). To keep the queue moving smoothly, you are limited to submitting no more than five files every 15 minutes, for an average of three minutes per file. If we find you are creating multiple accounts in order to get around this limitation, we may disqualify all your entries.
The code is limited in size by the database architecture. The column in our MySQL database that stores the M-code is of type text, which is limited of 65535 characters. Submissions longer than this limit will fail.
The files you need to get started on the contest are included in a ZIP-file in a ZIP-file available on the MATLAB Central File Exchange. If you download and uncompress this ZIP-file, you will have all the files you need to get started. You will be writing the file solver.m using the signature described in the Syntax section above. To test this function with the test suite in the ZIP-file, use runcontest.m.
>> runcontest
The contest is divided into three segments. Most of the week will run as usual, with free sharing of code, but the first two days of the contest will hide some of the information about each entry.
Starting and ending times are based on noon in Natick, Massachusetts, but the web pages will show all times in Coordinated Universal Time (UTC). For the exact timing of these three phases, refer to the box at the top right corner of the page.
Once an entry has been submitted, it cannot be changed. However, any entry can be viewed, edited, and resubmitted as a new entry. You are free to view and copy any entry in the queue. If your modification of an existing entry improves its score, then you are the "author" for the purpose of determining the winners of this contest. We encourage you to examine and optimize existing entries.
We also encourage you to discuss your solutions and strategies with others. You can do this by posting to the thread that we've started in our newsreader.
The allowable functions are those contained in the basic MATLAB package available in $MATLAB/toolbox/matlab, where $MATLAB is the root MATLAB directory. Functions from other toolboxes will not be available. Entries will be tested against the latest version of MATLAB.
The following are prohibited:
Entries that compromise the contest machinery are not allowed. Out of consideration for everyone participating in the contest, we ask that you not abuse the system.
Extraction of puzzles in the test suite by manipulating the score, runtime, or error conditions is also forbidden. Tuning the entry to the contest test suite via multiple entries is permitted, but we ask that you not overwhelm the queue.
Check our FAQ for answers to frequently asked questions about the contest.
Contests are divided into segments where some or all of the scores and code may be hidden for some users. Here are the segments for this contest: