NOTE: Here is the spreadsheet for this post if you’d like to follow along. This is gonna be an awesome sheet to have laying around just as a refresher on these techniques. And remember, don’t forget to sign up for the newsletter to hear about future Analytics Made Skeezy tutorials.
What a crazy weekend. Nuts. I’m completely wiped, but I’m sitting here typing, because I’m jet lagged and can’t sleep a wink.
It all started Friday morning with a knock on my bedroom door. Graham was standing there in his tighty whiteys (why does he wear those again?) and mustache looking like some kind of molester. He was rubbing sleep from his eye.
“There’s someone at the door for you,” said Graham.
“The front door?” I asked groggily. The night before had been fun but unkind, and my head felt a little like it’d been taxidermied.
“Yes, Alex. The front door you jackass. What other door would I be talking about?” he said, irritated.
My first thought was Andre. I just knew it was him. Victor’d sent him again. How much math can one drug dealer need help on, I wondered.
“Some older dude in a suit,” Graham added as I rolled out of bed and tossed on some pants, “Looks like he could be your dad, man.”
Oh shit, I thought. This is gonna be the FBI, DEA, who knows who. I’m so screwed.
But it was neither Andre nor the heat. It was Victor. Victor in a three piece suit, looking dapper as hell. He was standing at the door of the frat house eyeing his watch with impatience when I walked up.
“Come on my boy,” he said when he saw me, “We must get going.”
“Huh?” I asked, still a bit groggy. It was only just now 6 AM.
“We have a plane to catch,” he said, “Put on some clothes, and grab your passport and wallet. You do have a passport, don’t you?”
“Yeah, I have a passport,” I answered, “But why?”
“I’ll tell you in the car, now go get what you need for the weekend,” he said and turned back to the CLK Black that was idling in front of the house.
I shoved some stuff in a bag as fast as I could, and moments later, we were speeding down 85 toward the airport.
“I must go to Amsterdam on business this weekend, and I have some decisions I need to make before I meet with my suppliers. There’s no time for you to help me here, so you must work with me on the plane,” he said.
“I’m going to Amsterdam?” I asked.
“Have you ever been?” he asked.
“Never left the states except to Canada and the Bahamas,” I said.
“Well, then you should enjoy this,” he laughed and looked over at me, “Consider it a short European holiday.”
Any thought of this being a holiday faded when I passed through security at Hartsfield in the company of an international drug dealer. The TSA agent stopped me, and for a second I thought that surely I’d be arrested for who knows what. But it turned out that I’d left my lighter in my pocket, so I mopped away the bullets I was sweating and followed Victor to the international concourse.
You get this idea from the movies that all drug dealers fly in private jets, but that’s just not the case it seems. I asked him about it.
“Private jets attract scrutiny. I lead a simple life, but I do like to fly first class on international travel,” he said with a shrug.
When we’d boarded, gotten our gin and tonics in order, and leveled out at thirty six thousand feet, Victor removed his laptop from his carry-on.
“So here’s the deal,” he said, “I’m meeting with my main X supplier in Amsterdam, and he wants me to pick a design for my pills from the new ones he’s got rolling off the line.”
“OK,” I nodded.
He opened a spreadsheet and pointed the screen my way.
“These are my options,” he said.
“Wow, ok,” I said, “Kind of random, aren’t they?”
“He makes what he makes. Strange guy,” said Victor.
“So why not just pick one, any one?” I asked, “Like the four leaf clover. That could be cool.”
“If you were a car company would you just release any car? Would you just plop out an Aztek and be done with it?” he asked.
I shook my head no.
“Of course not,” he said, “And in this case, we know for a fact that pill attributes directly affect sales. That’s what my dealers tell me. At the bigger festivals and shows, there’s plenty of competition. I need my pills to stand out.”
“So what are the most important attributes?” I asked.
“Well, I asked that of twenty of my trusted dealers. I had them give me a ranked list of the attributes that mattered when they tried to sell this stuff,” Victor said, “I also had them vote individually on which attribute was most important in each pair of attributes. This is what they gave me.”
He showed me two small tables of data. The ranked list:
“So color is more important than purity?” I asked.
“Druggies like pretty colors. Especially under black light. Very few bring Simon’s reagent with them, and even when they do, they’re willing to accept less than 100% MDMA so long as the additives are benign,” he shrugged.
“Explain texture,” I said.
“If the pill looks bumpy or brittle, people seem to think it’s cheap, like back alley lab cheap,” he answered.
“And timeliness?” I asked.
“Is the pill stamped with an image, shape, or text that provides a timely reference to world events?” Victor said, “So the Obama pill and the Republican pill are both timely since it’s an election season, while the four leaf clover would be better next March for St. Patty’s. People get a kick out of those touches, and it makes the pill seem fresh.”
“So ‘pop culture’ is like the Bart Simpson pill,” I said, “It’s a reference.”
“Yes, and likewise ‘Brand’ just means it gives a brand reference,” he said.
“What about happy?” I asked.
Victor laughed, “Does the pill look happy? This seemed stupid to me, but many of my dealers agreed on this. People like uplifting images on the pill. Something that’s not intimidating.”
“Weird,” I said.
“And here’s the vote data,” he added and pulled up another table, “I had 20 dealers weigh in.”
“OK, so here you’ve got all the pairwise comparisons with vote counts,” I said.
Victor nodded, “And here I graded my options as best I could based on these attributes.” He flipped over to a new sheet:
“One means yes, zero means no. The MDMA column gives the percentage MDMA. Also, I gave Bart only half a point for his color since it’s not very vibrant,” he said.
Victor took a sip of his drink and looked over at me, “So which one do I pick?”
I laughed, “So what you’re bumping up against here is a topic called multi-criteria decision analysis. How do you make decisions in the midst of multiple, competing objectives such as the oft-cited ‘risk versus reward’?”
“So this is like picking stocks?” he asked.
“To a degree. We’ve got a whole host of criteria here, but we can only pick one pill design. I can tell you up front though that whatever we pick, it ain’t going to be the Optimus Prime, the clover, the smiley face, or the crown,” I said.
Victor furrowed his brow and looked back at his spreadsheet, “Why?”
“Because those four pills are all strictly dominated by other pills. They’re not Pareto efficient,” I said.
Victor’s brow furrowed even further, so I pointed out a few rows in the spreadsheet, “What I mean by that is that if we look at the grades you’ve given here I can see that Bart has everything the Smiley pill has and more. So no matter which attributes are most important, Bart is always going to beat Smiley. So we can ignore Smiley.”
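(If you’d rather check dominance outside the spreadsheet, here’s a rough Python sketch. It assumes higher is better on every attribute, and the option names and scores in `toy` are made up for illustration; they are not Victor’s actual grades.)

```python
from typing import Dict, List


def dominates(a: List[float], b: List[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_efficient(options: Dict[str, List[float]]) -> List[str]:
    """Names of the options that no other option dominates."""
    return [
        name
        for name, scores in options.items()
        if not any(
            dominates(other, scores)
            for other_name, other in options.items()
            if other_name != name
        )
    ]


# Hypothetical grades on three attributes (1 = yes, 0 = no).
toy = {"A": [1, 1, 0], "B": [1, 0, 0], "C": [0, 0, 1]}
efficient = pareto_efficient(toy)  # "B" drops out because "A" has everything it has and more
```

Anything that falls out of `efficient` can never come out on top, whatever non-negative weights you end up using.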
Victor nodded, “That makes sense.”
“So then the question is,” I said, “How do we combine your attribute scores into a single score here?”
“You average them,” said Victor.
I smiled, “Right, but we can’t do a straight average. After all, some attributes are more important. We have to do a weighted average.”
“Sure,” said Victor, “But what weights do we use?”
“For that, we get into a strange pseudo-scientific field of weighting techniques. We know that our weights should sum to 1, and we know that if an attribute is more important than another, then its weight should be bigger.”
“But how much bigger?” Victor interrupted.
“Precisely,” I said, “How should the weights decay as you go down in ranking? It depends. A lot of psychology goes into it. People tend to overweight their lower criteria when buying things only to make the decision almost entirely off their top criterion. For instance, they may say they care about fuel economy right after cup holders and paint color, but when your customer buys a Hummer, you know that that preference didn’t hold much weight.”
“So the weights should decay rapidly?” he asked.
“Fairly rapidly, yes,” I answered, “I wouldn’t give all the weight to ‘Color’ but I wouldn’t give but a tiny bit to ‘Brand.’”
“Got it,” he said, “So where do we start assigning the weights?”
“Well, let’s take a look at three ‘direct weighting’ techniques, and one ‘indirect’ weight technique that will use the vote data you’ve given me instead of the ranking data,” I said.
“Sounds good,” answered Victor.
I set Victor’s computer on the seat tray in front of me and added a column to the attribute ranking where I inverted the ranks:
“We’ll need that in a sec,” I said, “The first technique I want to show you is the simplest. It’s called Rank Sum, and the way it works is that if I come in first place out of eight criteria, I get 8 points out of a total of 8+7+6+5+4+3+2+1 points equal to 36 points. So my weight is .22. If I come in last place I only get 1 out of 36 points so my weight is .03.”
I scribbled out the formula on a napkin:
“This is the calculation for each attribute i with rank r_i where in this case K is 8,” I said. In Excel it looked like this:
“Those weights don’t taper off very fast,” I said as I dragged down the formula, “In fact, they decay linearly.”
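(For anyone following along without the spreadsheet, here’s a minimal Python sketch of Rank Sum as I described it: the attribute ranked r out of K gets K - r + 1 points out of 1 + 2 + ... + K. The function name is my own, not something from the sheet.)

```python
from typing import List


def rank_sum_weights(k: int) -> List[float]:
    """Rank Sum weights for ranks 1..k (rank 1 is the most important)."""
    total_points = k * (k + 1) // 2  # 1 + 2 + ... + k, which is 36 when k = 8
    return [(k - r + 1) / total_points for r in range(1, k + 1)]


weights = rank_sum_weights(8)
# First place gets 8/36 and last place gets 1/36, matching the .22 and .03 above.
print([round(w, 2) for w in weights])
```

Each step down the ranking costs exactly 1/36 of the weight, which is the linear decay showing up.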
“So what other options are there?” asked Victor.
“Well, on the flip side (pun intended), there’s something called Rank Reciprocal where I get the value of my inverse rank divided by the sum of all the inverse ranks,” I said.
I filled in the column in Excel with this formula:
“Here we get better decay at the beginning,” I said.
“But ‘Happy’ and ‘Brand’ actually count for more at the end,” he said.
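(A rough Python equivalent of Rank Reciprocal, assuming “inverse rank” means 1 divided by the rank and that the denominator is the sum of those values over all K attributes. The function name is mine.)

```python
from typing import List


def rank_reciprocal_weights(k: int) -> List[float]:
    """Rank Reciprocal weights for ranks 1..k (rank 1 is the most important)."""
    inverse_ranks = [1 / r for r in range(1, k + 1)]
    total = sum(inverse_ranks)
    return [v / total for v in inverse_ranks]


weights = rank_reciprocal_weights(8)
print([round(w, 3) for w in weights])
```

With eight criteria the last-place weight works out to (1/8) divided by the sum of the inverse ranks, which is larger than Rank Sum’s 1/36. That’s the heavier tail Victor noticed.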
“Right, so let’s look at my favorite technique. It’s kind of in the middle of these two. It’s called Rank Order Centroid, although occasionally you’ll hear it referred to as MAGIQ or SMARTER in the literature,” I said and jotted down a new formula on my napkin:
“Just like rank reciprocal, it uses the inverted rank values,” I said, “but this one gives us a decay that’s a bit more agreeable. Not that you care, but the calculation is also a bit better grounded in the natural world. It’s a bastardized center of mass calculation.”
I put the formula in Excel and dragged it down:
I graphed the three different weight columns:
“See how the Rank Order Centroid weights decay quickly and finish low?” I asked.
“That fits pretty well with human psychology. Whenever I don’t have weights to start with in a problem, I go the ROC route,” I said.
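(Here’s a small Python sketch of Rank Order Centroid using the usual textbook form: the attribute ranked r out of K gets (1/K) times the sum of 1/j for j from r up to K. The function name is my own.)

```python
from typing import List


def roc_weights(k: int) -> List[float]:
    """Rank Order Centroid weights for ranks 1..k (rank 1 is the most important)."""
    inverse_ranks = [1 / j for j in range(1, k + 1)]
    # The weight for rank r averages the tail 1/r + 1/(r+1) + ... + 1/k over k.
    return [sum(inverse_ranks[r - 1:]) / k for r in range(1, k + 1)]


weights = roc_weights(8)
print([round(w, 3) for w in weights])
```

For eight criteria the last-place weight is (1/8)/8 = 1/64, lower than either Rank Sum or Rank Reciprocal gives it, while first place still gets roughly a third of the total.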
“But then what about the votes?” asked Victor.
“Ah ha,” I said, “You’re right. We need to go over indirect weighting.”
I adjusted myself in my seat a little. The cabin was hot, and drinking only made it hotter. My butt was starting to stick to the seat.
“So here’s the deal,” I said, “Asking a group of people like your dealers to create a ranked list can be an inherently flawed question. Group decision-making on deciding the placement of low-ranking criteria is shit. Twenty people cannot collectively decide whether fifth and sixth place shouldn’t be reversed. If there’s a loud person in the room, no one’s gonna fight to make sure their ordering of low-ranking criteria wins.”
“So instead, using pairwise comparisons, like we have in this voting data, keeps the task manageable,” I said, “Furthermore, having votes instead of a pure ‘this is better than that’ gives us some more data we can use.”
I flipped the tab in the spreadsheet over to the voting data, “So we can use something called the analytic hierarchy process or AHP to transform these votes into weights. It’s a bit convoluted though.”
Victor nodded, “I’m trapped on a plane with nothing better to do. Let’s try it.”
“OK,” I said, folding my hands and cracking my knuckles, “You’ve got the winners in each vote in the left column, so what I’m going to do is take the vote difference and then normalize that difference to a score between 1 and 9 where 1 is a tie and 9 is a 20-to-0 vote.”
I plugged this formula into Excel to do the normalization:
and dragged it down. I then created a new tab called ‘AHP’ and created a criteria X criteria grid with 1s on the diagonal, the normalized scores on the upper triangle, and the inverse of the normalized scores on the lower triangle like this:
“So this is just a matrix representation of your vote data where a value greater than 1 indicates that the row value is more important and a value less than 1 indicates that the column value is more important,” I said.
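(A hedged Python sketch of those two steps. I’m assuming the normalization is a straight linear map from a vote difference of 0 to 20 onto 1 to 9; the exact spreadsheet formula may differ. The vote counts below are made up for illustration and are not the dealers’ real votes.)

```python
from typing import Dict, List, Tuple


def vote_score(winner_votes: int, total_votes: int = 20) -> float:
    """Map a vote margin onto 1..9 (assumed linear: a tie gives 1, a sweep gives 9)."""
    difference = winner_votes - (total_votes - winner_votes)
    return 1 + 8 * difference / total_votes


def build_matrix(
    criteria: List[str], votes: Dict[Tuple[str, str], int]
) -> List[List[float]]:
    """Pairwise comparison grid: the winner's row gets the score, the loser's row its inverse."""
    index = {name: i for i, name in enumerate(criteria)}
    n = len(criteria)
    grid = [[1.0] * n for _ in range(n)]  # 1s on the diagonal
    for (winner, loser), winner_votes in votes.items():
        score = vote_score(winner_votes)
        grid[index[winner]][index[loser]] = score
        grid[index[loser]][index[winner]] = 1 / score
    return grid


# Hypothetical vote counts for three of the criteria, keyed by (winner, loser).
criteria = ["Color", "Purity", "Texture"]
votes = {("Color", "Purity"): 12, ("Color", "Texture"): 20, ("Purity", "Texture"): 15}
matrix = build_matrix(criteria, votes)
```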
“What do we do with it though?” asked Victor.
“We’re going to pull an eigenvector out of it that’s going to be our weights,” I said, “which is a fancy word for a simple computation. The first thing we do is multiply all the elements on a row together and take their eighth root in this case because there’s 8 elements.”
I added in the column to the right of the matrix taking the 8th root of the product of the elements on each row.
“Then I’m going to total these values up and normalize them by their sum, so that the vector adds up to 1. And that’ll be my weight vector,” I said.
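(The row-product trick sketched in Python. The geometric mean of each row is a common shortcut for the principal eigenvector; the two agree exactly when the matrix is perfectly consistent and only approximately otherwise. The 3x3 `toy` matrix is a made-up example, not the dealers’ data.)

```python
import math
from typing import List


def ahp_weights(matrix: List[List[float]]) -> List[float]:
    """Approximate AHP weights: nth root of each row's product, normalized to sum to 1."""
    n = len(matrix)
    geometric_means = [math.prod(row) ** (1 / n) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent comparisons among three criteria.
toy = [
    [1, 3, 9],
    [1 / 3, 1, 3],
    [1 / 9, 1 / 3, 1],
]
weights = ahp_weights(toy)
print([round(w, 3) for w in weights])
```

In this toy case the row roots are 3, 1, and 1/3, so the weights come out in a 9:3:1 ratio.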
“And so the advantage of these weights is that I can elicit them without making the group create a ranking together?” asked Victor.
“So can we see how the various weights perform?” asked Victor.
“Let’s do it,” I said. I moved back over to Victor’s choices, pasted in the weights and took the sumproduct of weights and scores for each pill:
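(The SUMPRODUCT step is just a dot product. Here’s a minimal Python sketch, assuming every attribute score is already on a 0-to-1 scale, so a percentage like the MDMA column would be written as a fraction. The weights, pill names, and scores are invented for illustration.)

```python
from typing import Dict, List


def weighted_score(weights: List[float], scores: List[float]) -> float:
    """Same idea as Excel's SUMPRODUCT: multiply pairwise, then add up."""
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical weights and grades on three attributes.
weights = [0.5, 0.3, 0.2]
pills: Dict[str, List[float]] = {
    "X": [1, 0, 1],
    "Y": [0.5, 1, 1],
}
totals = {name: weighted_score(weights, scores) for name, scores in pills.items()}
best = max(totals, key=totals.get)
```

Swap in a different weight vector and rerun to see whether the winner changes, which is the comparison we’re making across techniques.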
We stared at the results a moment.
“Obama wins for everything but Rank Sum,” said Victor, motioning to the last row, “for that technique, Bart barely wins.”
“Right,” I said, “And that’s because Rank Sum decays slower, so even though Bart’s color is only a half point, he still gets lots of points later. That’s the wrong call in my opinion.”
Victor nodded, “I agree.”
“The rest are quite similar with departures occurring further down the pill ranking,” I said, “And that should tell you that your time is best not spent doing a crapload of complex scoring on pairwise comparisons. Just be careful when you rank your criteria.”
“That makes sense,” said Victor, “I think it’s interesting that some of those pills you excluded did well here.”
“Yeah, that’s because while they’re not Pareto efficient, they still had points in the right weighted places. The elephant was Pareto efficient but on low-weighted criteria.”
“Ah ha,” said Victor.
He slammed the lid to his laptop shut and smiled a broad smile, “Green Obama it is! Care for another drink?”
Big Data-ish Tip
Wanting to weight things comes up more often than you’d think in analytics. It’s not exciting, but when you have to do it, it’s nice to have some tools. If you can’t tell, I gravitate toward ROC as a quick-and-dirty technique that provides generally OK results.
It’s terribly common when dealing with KPIs, performance dashboards, and other various Lean Six Sigma garbage to need to weight scores. Doing a weighted average of scores is also one of the few ways to solve a linear programming problem when you’ve got multiple, conflicting objectives. You can always do something like put one of the objectives in the problem as a constraint or something, but if you can shove everything in the objective function, why not?
The government really digs this stuff, especially the military. If you’ve got a ton of primary and secondary objectives, how do you build a simplified dashboard full of red/yellow/green indicators? You’re gonna have to combine some stuff, and these weighting techniques are one way to do it.
Now, if you’re working a problem with tons of data…for instance, let’s say Victor had months of demand data for his pills in a database. In that case, it might be possible to build a CART model or a random forest model of pill attribute versus likelihood to be purchased or demand or something. Then you could use the relative importance of the attributes from the model as weights. That’s how I’d do it if I had the data.