Difference between revisions of "MTR Psi Testing"

From UFOpaedia

Revision as of 23:08, 4 September 2006

This is a "lab notes" page for my (MikeTheRed) Psi Testing, if anybody wants all the gory details.

Conventions: Numbers such as 95/16 always mean, e.g., psi strength 95, psi skill 16. MC = psionic mind control.

Background

The equations governing psionic success have long been a subject of mystery and debate. If they were better understood, players could compare their soldiers' stats against alien stats and know exactly how much Psi Strength and Psi Skill is needed, both as a minimum and at the point where a soldier is maxed out versus all aliens.

The Official Strategy Guide supplies psionic equations, but while they are very intriguing, they make no mathematical sense, as if the math operators are typo'd or something.

As part of my previous Experience testing ca. October 2005, I found that a 95/16 soldier directly next to a 25/0 muton had an MC success rate of 49% (1192/2420=49.26%), and that a 95/44 soldier never failed. But I soon began to directly edit UNITREF.DAT experience counters for my tests, and thus didn't pursue in-game psi tests any more.

Ethereal Cereal performed the first true testing in May 2006, arriving at the equations:

Attack Strength (AS) = psi str * psi skill / 50
Defense Strength (DS) = psi str + (psi skill / 5)
Panic Attack chance = 44% + AS - DS
Mind Control Attack chance = 24% + AS - DS

For these equations, one is using the attack strength of one party versus the defense strength of the target.

He had a good number of data points, and the results appeared solid. However, they didn't agree with my one highly-tested point. His equation equals 29% for my 95/16 versus 25/0 situation, whereas I got 49% success.
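The 29% figure can be reproduced with a minimal Python sketch of Ethereal Cereal's equations as quoted above. The function names are illustrative labels only (nothing here comes from the game's code), and clipping the result to the 0-100% range is an assumption based on the clipping behaviour discussed elsewhere in these notes.

```python
def attack_strength(psi_str, psi_skill):
    # AS = psi str * psi skill / 50 (as quoted from Ethereal Cereal)
    return psi_str * psi_skill / 50


def defense_strength(psi_str, psi_skill):
    # DS = psi str + (psi skill / 5)
    return psi_str + psi_skill / 5


def ec_chance(attacker, defender, constant):
    """Percent chance = constant + AS - DS, clipped to 0..100 (clipping is assumed)."""
    raw = constant + attack_strength(*attacker) - defense_strength(*defender)
    return max(0.0, min(100.0, raw))


# 95/16 soldier versus 25/0 muton, MC constant of 24
mc_chance = ec_chance((95, 16), (25, 0), constant=24)
print(f"Predicted MC chance: {mc_chance:.1f}%")
```

With the numbers above, AS is 30.4 and DS is 25, so the prediction is 24 + 30.4 - 25 = 29.4%, versus the 49% observed.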

One potentially important difference is that he had an alien MCing his soldiers, whereas I have my soldiers MCing aliens. Another potential difference is that he's using the WinCE Gold version, while I'm using the DOS 1.4 version (in DosBox). I'd consider the alien-vs-human target-type proposition to be more likely, because as far as folks can tell, the programmers tried to make an exact replica of the DOS version for Windows (a few bugs notwithstanding). And the target-type proposition is testable, as well (if we have the time, fingers crossed!). Anyway, the results remained puzzling to me, but I didn't have time to do psi testing then.

Another reason it'd be good to know the psi equations is that effectiveness clearly decreases with distance. Again, how much strength and skill is needed to "totally rule", even across a large map? No one can say until the equations are deduced.

Test 1: Basics

Graph of Ethereal Cereal's equation results

For my test setup, I made a map with 16 mutons, and had 16 soldiers with psi amps that were directly next to and facing the mutons (zip file of it here).

For this first test, I wanted to keep it simple. I made all aliens 25/0, to compare with my earlier finding. Based on Ethereal Cereal's equations (see inset), if one attribute (Strength or Skill) is 100, the other can be up to about 50 before the MC success rate is clipped at 100%. To simplify, half my 16 guys had Strength=100, half had Skill=100, then I chose eight equally spaced points within the range that would be applied to Strength or Skill (whichever was not 100 for those 8 soldiers). I chose this "alternating 100" approach because it would test the symmetry and, if it did look symmetrical, the results could be combined to give them more power (a better correlation coefficient on a regression line).

When choosing the eight points from 0 to 50, I also took into consideration how my finding was different from EC's, by 20 points. But I screwed up here and aimed for 0 to 70+... actually my finding was lower than his, so 0 to 50 would've encompassed all concerns. Anyway, I made my soldiers be 9 to 72 in the non-100 attribute.

Very quickly I saw that the higher half of each stick of eight was MCing 100%. So I limited my testing to the lower half of each stick (one attribute = 9, 18, 27, or 36; other attribute 100). I aimed for 50 trials, but went over by one, so N=51 per soldier. Each soldier only made one MC attempt per game turn (otherwise, it's problematical to have the target right next to them). I used the psi experience counter UNITREF.DAT[84] to determine success, because it's much less error prone than counting manually (this work is SO tedious!). But it counts by 1 for a failed attempt and 3 for a successful attempt, so this shows the success rate:

Successes = (UR[84] - Attempts)/2 = ([84]-51)/2
Success rate = Successes/Attempts = Successes/51
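The counter arithmetic above can be sketched in Python as follows. This assumes, as stated, that UNITREF.DAT[84] goes up by 1 for a failed attempt and by 3 for a successful one, and that the counter started at zero. The function name is a hypothetical label, and the counter value 91 is back-computed for illustration from a 20-success soldier, not a recorded reading.

```python
def successes_from_counter(counter, attempts):
    """Recover the number of successful MCs from the psi experience counter.

    Assumes +1 per failed attempt and +3 per successful attempt, so each
    success contributes 2 points beyond the 1 point every attempt earns.
    """
    extra = counter - attempts
    if extra < 0 or extra % 2 != 0:
        # An odd or negative remainder means the attempt count is off
        raise ValueError("counter does not match the stated number of attempts")
    return extra // 2


attempts = 51
counter = 91  # illustrative value: 31 failures * 1 + 20 successes * 3
successes = successes_from_counter(counter, attempts)
rate = 100 * successes / attempts
print(f"{successes} successes in {attempts} attempts = {rate:.1f}%")
```

The consistency check is what makes the counter useful as a sanity test: if a soldier was skipped or a turn was miscounted, the remainder will usually not come out even.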

Given 51 attempts per soldier, the results were:

          UNITREF.DAT                                               Pooled
          [57]    [37]     MC    success        Pooled    Pooled   success  
 Soldier   Str     Skl  success    rate         attrib.   success    rate
    01       9     100     20      39.2%            9        39      38.2%
    02      18     100     29      56.9%           18        62      60.8%
    03      27     100     48      94.1%           27        96      94.1%
    04      36     100     51     100.0%           36       102     100.0%
    09     100       9     19      37.3%               
    10     100      18     33      64.7%               
    11     100      27     48      94.1%               
    12     100      36     51     100.0%

Although I suspected that the soldiers with an attribute level of 36 were maxed, I kept testing them, both to be entirely sure they were maxed, and as a check that I counted the number of attempts correctly, my UNITREF approach was working, etc.

The results do appear to be symmetrical, so they can be combined as shown in the pooled results above. N=102 for each of the four points, but the attribute=36 point is probably clipped (above the maximum) so it should be dropped from regression analysis. A regression line then drawn for success rate versus attribute level for the three points above (with the other attribute fixed at 100) gives the following for y=mx+b equations, where y is percent MC success, x is attribute level, and m is slope of line:

Int=0?     m         b         R2     100% Intercept
  no     3.10%     8.50%     0.9877     29.5
 yes     3.51%     0.00%     0.9682     28.5

In other words, when one attribute (psi strength or skill) is held to 100, every point of increase of the other attribute increases MC success by 3.1% (if the regression line is not forced to go through success=0 at attribute=0) or 3.5% if the y intercept is set to 0. Either approach (intercept=8.5% or forcing it to zero) is problematic, however, because the game is liable to be clipping if values fall below zero, and Attack Strength minus Defense Strength has the potential to result in negative numbers which get clipped.
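For anyone wanting to re-run the two fits, here is a minimal pure-Python sketch using the three unclipped pooled points from the table (9, 18 and 27, each with N=102). The helper names are illustrative, and ordinary least squares is assumed to be the method behind the figures quoted above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return m, b, 1 - ss_res / ss_tot


def fit_through_origin(xs, ys):
    """Least-squares slope for y = m*x with the intercept forced to zero."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


attrib = [9, 18, 27]
pooled_successes = [39, 62, 96]          # out of 102 attempts each
rates = [100 * s / 102 for s in pooled_successes]

m, b, r2 = fit_line(attrib, rates)
m0 = fit_through_origin(attrib, rates)
print(f"free intercept:   m={m:.2f}%  b={b:.2f}%  R2={r2:.4f}  100% at {(100 - b) / m:.1f}")
print(f"forced through 0: m={m0:.2f}%  100% at {100 / m0:.1f}")
```

Run on these three points, the slopes come out near 3.10 and 3.51 and the 100% crossings near 29.5 and 28.5, in line with the table.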

The 100% success ceiling is reached when the non-100 attribute equals ~29. But that's only an estimate, because it's an estimate of a line that's not perfectly correlated. The finding of psi 100/36 giving 100% success (N=102) is in agreement, though... it's probably somewhere around 29.

The coefficient of determination (R squared) is high.

Clearly, MC is more effective for me (i.e., lower attributes needed) than predicted by EC's equations. Until such time as it can be further understood, then, it appears that aliens are different somehow versus X-COM soldiers, such that they are less effective for a given psi strength and skill.

Test 2: Constants and Multiplication

For the next test, I wanted to make sure that the underlying assumptions are as we expect them to be:

  1. What is the constant in the MC equation? (EC believes it's 24%)
  2. Is multiplication correct for Attack Strength? (Or might psi skill or strength contribute in an additive or subtractive way?)

These were addressed by:

  1. Setting the muton targets to psi strength of 0. (They're already skill 0.) This has the effect of making any MC attack constant "stand clear" of any potential obfuscation caused by misunderstanding the Defense Strength.
  2. Setting soldiers' psi strength to 0, and varying their skill. If possible, I would have interlaced strength 0 and skill 0 for soldiers, but a skill of 0 disables psi capability. Anyway, setting one of the two variables in the Attack Strength equation (see Background above) to 0 should make it all equal 0, if it only uses multiplication or division (but not if it uses Skill with addition or subtraction somehow).

The soldiers were put into groups of four based on psi skill, to increase sampling and allow pooling at those points. Psi skills and results were:

                        Percent
Soldier  Skill Success  Success (30 attempts each)
  01       1     12     40.0%
  02       1     10     33.3%
  03       1     10     33.3%
  04       1     11     36.7%
  05      50     13     43.3%
  06      50     12     40.0%
  07      50     13     43.3%
  08      50     11     36.7%
  09     100     12     40.0%
  10     100     16     53.3%
  11     100     15     50.0%
  12     100     17     56.7%
  13     255     11     36.7%
  14     255      9     30.0%
  15     255     14     46.7%
  16     255     15     50.0%

This can be summarized:

 Psi Skl Successes   N       Min      Ave+/-SDs      Max
     1      43      120     33.3%    35.8%  3.2%    40.0%
    50      49      120     36.7%    40.8%  3.2%    43.3%
   100      60      120     40.0%    50.0%  7.2%    56.7%
   255      49      120     30.0%    40.8%  9.2%    50.0%
 Overall   201      480     30.0%    41.9%  7.7%    56.7%
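The summary rows can be rebuilt from the per-soldier counts with a short Python sketch. It assumes the "+/-SDs" column is the sample standard deviation of the four per-soldier percentages in each group, which is what reproduces the quoted figures; the variable names are illustrative.

```python
from statistics import mean, stdev

ATTEMPTS_EACH = 30

# Successes per soldier, grouped by psi skill (from the table above)
successes_by_skill = {
    1: [12, 10, 10, 11],
    50: [13, 12, 13, 11],
    100: [12, 16, 15, 17],
    255: [11, 9, 14, 15],
}

summary = {}
for skill, counts in successes_by_skill.items():
    rates = [100 * c / ATTEMPTS_EACH for c in counts]
    # mean of the per-soldier rates equals the pooled rate because N is equal
    summary[skill] = (sum(counts), min(rates), mean(rates), stdev(rates), max(rates))
    total, lo, avg, sd, hi = summary[skill]
    print(f"skill {skill:>3}: {total:>3} successes  min {lo:.1f}%  ave {avg:.1f}% +/- {sd:.1f}%  max {hi:.1f}%")

all_counts = [c for counts in successes_by_skill.values() for c in counts]
overall_rate = 100 * sum(all_counts) / (len(all_counts) * ATTEMPTS_EACH)
print(f"overall: {sum(all_counts)} of {len(all_counts) * ATTEMPTS_EACH} = {overall_rate:.3f}%")
```

The overall line is where the 41.875% figure comes from: 201 successes in 480 attempts.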

There is considerable variation; more samples sure would help. But this work is SO tedious!

Although the success rate for the 1-50-100 skill progression hints at a trend, Skill=255 is back down to the level of Skill=50 (and the single lowest soldier success rate, a 9, is in this group). I see no clear difference between the groups, especially if you consider that Skill=255 should be MUCH higher than the lower groups, if Skill were indeed influencing the groups. So, while the data is messy, I buy that it's a multiplicative equation, at least insofar as Skill is concerned.

The best guess at the constant is 41.875%, although it could easily be 40% or something else, with so much variability present. Notice how Test 1 could be taken to hint that the constant is 33.5 (alien psi strength of 25 plus b intercept of 8.5). However, there is considerably more error in an extended regression line... there's a fair amount of variability in the slope, which gets compounded by extending it down to zero. In any event, it also hints at a fairly high constant.

As things become clearer, the data from Test 1 may be able to be pooled with other data in order to pin down the constant better, but for now, 41.875 is the best guess.

Psi Test 3: Focus on the Constant

As I thought about what to do next, it occurred to me that I didn't really like/trust that high constant (42%). So I decided to test it more.

3a: Double-check fails

If the constant is 42%, then mutons at 25/0 should get MCed at least some, even if soldiers are set to Strength 0. Specifically, the equations predict it should be approx. 17% of the time (~42% - DS 25%). So I set half my soldiers to 0/1 and the other half to 0/255, thinking to also double-check whether multiplication matters, as I went.

But it quickly became clear that something was wrong. Not a single MC worked, in a total of 147 MC attempts (split about evenly between the two groups). At 17%, I should've seen ~25 successful MCs. So something is wrong with the equation (or my understanding of it, or my testing). So now I'm backing up to see what's the lowest level for a muton's psi strength, where I am able to MC some. This might give important insights into the equation(s).
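To put a number on how surprising zero successes is, here is a small Python sketch. It assumes each MC attempt is an independent trial with the same 17% chance, which is a simplification, not something established about the game.

```python
def prob_zero_successes(p, n):
    """Chance of seeing no successes at all in n independent attempts at rate p."""
    return (1 - p) ** n


p = 0.17   # predicted MC chance from a constant of ~42% minus DS of 25
n = 147    # attempts actually made

expected = p * n
p_none = prob_zero_successes(p, n)
print(f"expected successes: {expected:.1f}")
print(f"probability of zero successes: {p_none:.2e}")
```

Under that assumption the chance of a clean sweep of failures is on the order of one in a trillion, so the zero result cannot reasonably be blamed on sampling.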

3b: Playing with muton Psi Strength, with soldiers at Psi Strength 0

First off: Did setting mutons to strength 0 (in Test 2) somehow totally muck up the equations? To test this, I'm setting some mutons to 1/0, and some to 12/0 (halfway to 25/0). Soldiers will stay at half 0/1 and half 0/255.

These results show that:

  1. For one thing, MCs ARE seen, for mutons at both 1/0 and 12/0. So Test 3a must have hit a "ceiling" - somewhere between 12/0 and 25/0, soldiers can no longer MC when at psi strength=0. This also means that 0/0 mutons don't do weird things due to being 0/0 (read on)...
  2. Soldier psi skill continues to appear to not matter if their psi strength is 0: at skill=1, MC% is 31.1% ± 12.0% (N=280); at skill=255, MC% is 34.3% ± 15.0% (N=280). While the MC% is a little higher at skill=255, there's a huge overlap (i.e., little or no difference), whereas psi skill itself has changed in the extreme (from 1 to 255). Further, one of the soldiers with skill=255 actually had the lowest MC% value (10.0% vs 17.5% for skill=1; N=40 per individual soldier). At this point, I'm pretty satisfied that MC Attack uses multiplication, at least insofar as psi skill is concerned. Probably psi strength, too, but you can't make psi skill go to 0 to directly test that.
  3. Conversely, muton psi strength definitely makes a difference. With mutons at 1/0, we see MC% 44.3% ± 5.1% (range 37.5 to 52.5%, N=280), and at 12/0, MC% is 21.1% ± 5.9% (range 10.0 to 27.5%, N=280). Notice how even the extremes do not overlap - the 12/0 max is 27.5%, and the 1/0 min is 37.5%. Also note how the percent chance is decreasing as 25/0 is approached. The 41.9% ± 7.7% seen with a 0/0 muton (Test 2) is also roughly in agreement with this (it's close to the 44.3% ± 5.1% seen with 1/0).

At first glance, the results seem to be suggesting that muton Defense Strength is increasing by roughly 2 times their psi strength. I say this based on the equation:

MC% = Constant - Defense Strength    (Attack Strength is presumably zero when soldiers are Strength 0)

If Defense Strength were simply the muton's psi strength, MC% should take 40+ points of psi strength to fall from 40+ to 0, but it actually gets there by the time muton psi strength is only about 25... you get the idea. A regression line through the three(!) points (0/0, 1/0, and 12/0) gives MC% = -1.8895*Strength +43.932, where Strength = muton Psi Strength (R2=.9715). Although it's only based on three points so far, it looks intriguingly like:

MC% = 44 - 2 * Muton Psi Strength    (when Attack Strength is zero)
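The three-point regression can be re-run with the sketch below. It uses the rounded percentages quoted in the text (41.875% at 0/0, 44.3% at 1/0, 21.1% at 12/0), so the last digits differ slightly from the reported -1.8895 and 43.932, which were presumably computed from unrounded data. The helper name is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return m, b, 1 - ss_res / ss_tot


muton_psi_str = [0, 1, 12]
mc_percent = [41.875, 44.3, 21.1]   # soldiers at psi strength 0

slope, intercept, r2 = fit_line(muton_psi_str, mc_percent)
print(f"MC% = {slope:.4f} * Strength + {intercept:.3f}   (R2 = {r2:.4f})")
print(f"MC% reaches zero near muton psi strength {-intercept / slope:.1f}")
```

The fitted line crosses zero a little above psi strength 23, which fits with the 25/0 mutons in Test 3a never being MCed.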

Perhaps this new wrinkle will help explain some of the differences seen.

Ethereal Cereal's comments

Given the formulas I discovered, with Psi str = 0 (Attack Strength = 0), base MC chance is 24 - Defense strength. So Mutons, at Psi str 25, give chance = -1%, which would explain the ceiling you experienced.

I used your savefile and hacked the soldiers to Str 0 and the Mutons to Str 21. I was able to succeed in 6 of 160 MC attempts: 3.75%, slightly above the 3% my formula would predict. I used a simple testing method: I made sure each of the 16 flanking soldiers MCed exactly twice, and counted the number of controlled Mutons at the end of each turn. That might be a faster method to use than yours while still being fairly error-proof.

(Re-reading your above text, I see that MCing once per soldier is better -- necessary, even -- when success rates are high. But counting controlled Mutons once per turn may be faster than counting Psi Skill counters.)

I suggest you set all soldiers' skill to 50 and str to 51 (or any combo that will produce 2550 when multiplied) and see if you get 50% success rates when MCing Mutons of psi str 25.--Ethereal Cereal 21:03, 3 September 2006 (PDT)

Test 2

Okay. I took your savegame, hacked all the soldiers to 255 TUs, Psi Str 50, Skill 77, and all the Mutons to Psi Str 100, Skill 0. media:AS77DS100.zip

In 640 MC attempts, I had 13 successes, a 2% success rate, where my formula predicts 1%. When I upped the Mutons' Psi Str to 101, I had 0 successes in 480 attempts. The 1% vs. 2% discrepancy might be attributable to the as-yet-undiscovered distance portion of the equation.

Your formula (MC chance = 42% + AS - 1.75xDS) would predict a 42 + 77 - 175 = -56% chance. Test out the above savegame, tell me what results you experience.

I suggest at some point switching your testing to Panics, as you can repeatedly panic units, unlike MCs.--Ethereal Cereal 16:08, 4 September 2006 (PDT)
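The two competing predictions for this savegame can be laid side by side with a short Python sketch. The function names are illustrative labels; the formulas are the ones quoted in this section (24% + AS - DS, and 42% + AS - 1.75xDS), with AS and DS computed from the Background equations.

```python
def attack_strength(psi_str, psi_skill):
    return psi_str * psi_skill / 50


def defense_strength(psi_str, psi_skill):
    return psi_str + psi_skill / 5


def ec_formula(as_, ds):
    # Ethereal Cereal's MC equation
    return 24 + as_ - ds


def mtr_tentative(as_, ds):
    # The tentative modified MC equation under discussion
    return 42 + as_ - 1.75 * ds


as_ = attack_strength(50, 77)     # soldiers at 50/77
ds = defense_strength(100, 0)     # mutons at 100/0
observed = 100 * 13 / 640

print(f"AS={as_:.0f}  DS={ds:.0f}")
print(f"24 + AS - DS        -> {ec_formula(as_, ds):.0f}%")
print(f"42 + AS - 1.75*DS   -> {mtr_tentative(as_, ds):.0f}%")
print(f"observed            -> {observed:.2f}%")
```

A negative prediction would presumably be clipped to 0%, so any successes at all in 640 attempts sit closer to the 1% prediction than to the -56% one.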

3d: Finalizing results for Soldiers at Psi Strength 0

Graph of MC% Success when Soldiers are at Psi Strength 0

Thanks for the comments, EC. Your data fits right into the graph I've made (inset).

I've done more testing including Test 3c (not shown per se) and 3d. 3c put more points on the graph for a 6/0 muton (MC% 25.8% ± 11.3%, range 10.0% to 45.0%, N=240) and a 19/0 muton (MC% 12.5% ± 3.2%, range 7.5% to 17.5%, N=280). Then I added your point at 21/0 (MC% 3.75, N=160) and, using your cogent comment about multiple MCs, I jacked the soldiers to 255 TUs (allows 10 MCs per turn) and set half the mutons to 23/0 and the other half to 24/0. With the expected MC% very low, I could go to town with the testing. At 23/0, I saw MC% 1.6% ± 1.3%, range 0.0 to 3.4%, N=704. The 24/0 mutons never got MCed (N=704), so that looks real solid.

The 6/0 data point seems low, but what can you say... that's sampling for you.

Doing multiple MCs per turn when the expected rate is very low seems like a great idea, and testing went much faster. When you only do one MC per turn, a large percent of your time is spent simply selecting each soldier, psi amp, and target... this really speeds it up. But in practice, I wonder if it might be problematic. Several times, my soldiers got more than one MC in a turn, which made me use other nearby mutons... but if the success rate is very low, and the decrease due to distance is also real low, it's conceivable that the two might collide: e.g., farther mutons may have what very little chance of success they had wiped out by the additional distance.

In order to control for this, one could stop MCing with a particular soldier, if they have a success. Then keep track of how many times they didn't do "all" their MCs, and in the end, subtract those missed chances from the number of attempts for the soldier. This is one way to avoid a potential problem, and still lets you move pretty fast (much more often than not, they don't have any success even with ten attempts, when rates are this low).
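The bookkeeping described above might look like this in Python. The function name and the example numbers are hypothetical, made up purely to show the arithmetic; they are not test data.

```python
def adjusted_rate(successes, planned_attempts, skipped_attempts):
    """Success rate in percent after removing attempts that were never made.

    skipped_attempts counts the MCs a soldier did not perform because they
    stopped for the turn after a success.
    """
    actual_attempts = planned_attempts - skipped_attempts
    if actual_attempts <= 0:
        raise ValueError("no attempts left after subtracting skipped ones")
    if successes > actual_attempts:
        raise ValueError("more successes than attempts")
    return 100 * successes / actual_attempts


# Hypothetical soldier: 4 turns at 10 MCs per turn = 40 planned attempts.
# One success on the 3rd attempt of a turn, so the other 7 were skipped.
rate = adjusted_rate(successes=1, planned_attempts=40, skipped_attempts=7)
print(f"adjusted success rate: {rate:.2f}%")
```

Dividing by the attempts actually made (33 here) rather than the planned 40 keeps the rate from being understated.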

All in all, I like the speed boost due to testing "low ends" of expected ranges. So perhaps I can think of ways to use this in my next tests.

Test 3 Summary

It seems pretty solid that the MC Attack equation is using multiplication, at least insofar as the soldiers' psi skill is concerned. No clear effect due to varying psi skill from 1 to 255 was ever seen, when soldiers' psi strength was pegged at zero.

It also seems clear that Defense Strength actually works as a multiple (probably target psi strength times 1.75) instead of being used "straight up" as the initial belief was.

Finally, the constant for MC Attack chance appears to be ~42.

The MC equation can therefore be tentatively modified at this point to be:

Mind Control Attack chance = 42% + AS - 1.75xDS

In truth, though, at this point it can't be discerned whether that 1.75 should be placed in the MC equation per se or directly in the DS equation. The difference is that it would also affect Panic success if placed in the DS equation.
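As a cross-check on the 42 and 1.75 figures, here is a Python sketch that fits a line through the eight soldier-strength-0 data points quoted in these notes (Test 2, 3b, 3c, Ethereal Cereal's 21/0 point, and 3d). Pooling exactly these eight points, including the 24/0 point that may already be clipped at zero, is an assumption; the inset graph itself is not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


# (muton psi strength, observed MC%) with soldiers at psi strength 0
points = [
    (0, 41.875),  # Test 2
    (1, 44.3),    # Test 3b
    (6, 25.8),    # Test 3c
    (12, 21.1),   # Test 3b
    (19, 12.5),   # Test 3c
    (21, 3.75),   # Ethereal Cereal's data point
    (23, 1.6),    # Test 3d
    (24, 0.0),    # Test 3d (never MCed; possibly clipped)
]

slope, intercept = fit_line([p[0] for p in points], [p[1] for p in points])
print(f"MC% = {intercept:.1f} {slope:+.3f} * muton psi strength")
```

On these inputs the intercept lands close to 42 and the slope close to -1.75, consistent with the tentative equation above.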

Ethereal, I'm not sure why you are still calling the constant 24%, when it seems pretty clear that it's higher. (Compare the MC% that 24% gives for AS=0 and DS=0, versus the inset graph.) Let me know if I'm missing something!

Next I guess I will probe the lower edges of the Attack Strength and Defense Strength equations, to verify that they operate as believed. I could shoot for 50% like you suggest EC, but I want to go with the faster low-success approach for now. (It's such a refreshing change!) FWIW, I still prefer using the Unitref psi experience counter, because it also lets you know if you somehow messed up and, e.g., skipped a soldier during testing, or got your number of attempts wrong because of running on for one more (or one less) turn than you wanted. Also, counting MCed aliens every turn introduces some time spent every turn (with a slight potential for error in writing down findings), whereas the Unitref check only happens once. It's not hard to simply leave EDIT running in the target directory, then re-open Unitref again (shrug).

Psi Test 4: Varying Attack and Defense Strengths at Low MC Percent

For these tests, I will vary soldiers' psi strength and skill, and alter mutons' psi strength so that a very low MC% is expected (1 to 3%). Muton psi skill will be kept at 0. A low expected MC% lets me MC much faster.

4a: A stab in the dark

I used my equation above, and pulled soldiers at 10/100 or 100/10 out of the air, versus mutons at 35/0. This should have produced a 1% MC rate.

But testing quickly showed that this produced a higher rate than expected. Specifically, 7.5% ± 5.8% (N=88). This is too high for speedy low-end testing, so I stopped after only one round. Back to the drawing board.
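The 1% expectation can be checked with a few lines of Python, using the tentative equation from the Test 3 Summary and the Attack Strength equation from the Background. The function name is an illustrative label.

```python
def tentative_mc_chance(soldier, muton):
    """42% + AS - 1.75*DS, using AS and DS as defined in the Background."""
    s_str, s_skill = soldier
    m_str, m_skill = muton
    attack = s_str * s_skill / 50
    defense = m_str + m_skill / 5
    return 42 + attack - 1.75 * defense


muton = (35, 0)
predicted_a = tentative_mc_chance((10, 100), muton)
predicted_b = tentative_mc_chance((100, 10), muton)
observed = 7.5

print(f"10/100 soldier: {predicted_a:.2f}%   100/10 soldier: {predicted_b:.2f}%")
print(f"observed: {observed:.1f}% (about {observed / predicted_a:.0f}x the prediction)")
```

Both soldier setups give the same Attack Strength of 20, so the equation predicts 0.75% for each, roughly a tenth of the rate actually seen.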

---MikeTheRed 15:32, 4 September 2006 (PDT)