Not too long ago, I studied a modified Collatz rule where
$$f(x)= \begin{cases} 3x+5, & \text{if $x$ is odd} \\ x/2, & \text{if $x$ is even} \end{cases}$$
by observing the trajectories of $n$ with some code I wrote. Beginning with $n = 1$, the code calculated the trajectory of each seed (starting number) $n$ until the trajectory reached a loop. It then dumped the loop into a spreadsheet and repeated the process for $n+1$ until some defined limit for $n$ was reached. The resulting spreadsheet contains every starting number and the loop each of those numbers ended up in. I did not record the original trajectories in the spreadsheet.
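The original code is not shown here, so the following is only a minimal sketch of the procedure just described. The names `f` and `find_loop`, the `max_steps` guard, and the choice to label a loop by rotating it to its smallest member are all assumptions made for illustration.

```python
def f(x: int) -> int:
    """One step of the modified rule: 3x + 5 for odd x, x / 2 for even x."""
    return 3 * x + 5 if x % 2 else x // 2


def find_loop(n: int, max_steps: int = 1_000_000) -> list[int]:
    """Follow the trajectory of n until a value repeats, then return the loop.

    The loop is rotated to start at its smallest member, so the same loop
    gets the same label no matter where the trajectory entered it.
    max_steps is a hypothetical safety guard, not part of the original method.
    """
    seen = {}        # value -> position at which it first appeared
    trajectory = []
    x = n
    while x not in seen:
        if len(trajectory) >= max_steps:
            raise RuntimeError(f"no loop found for {n} within {max_steps} steps")
        seen[x] = len(trajectory)
        trajectory.append(x)
        x = f(x)
    loop = trajectory[seen[x]:]          # everything from the first repeat onward
    start = loop.index(min(loop))
    return loop[start:] + loop[:start]


print(find_loop(3))   # 3 enters the loop containing 19
```

Unlike the lists further down in this post, the returned loop does not repeat its first entry at the end.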
In this Google Document, I created pie charts for the sample sizes 100, 1,000, 10,000, 100,000, and 1,000,000.
For each chart, I took every starting number from 1 up to the sample size, grouped the numbers by the loop their trajectories entered, and computed each loop's share of the sample.
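A sketch of that tallying step, under the assumption that each loop is labelled by its smallest member. The names `loop_label` and `loop_shares` and the caching scheme are hypothetical and are not taken from the original code.

```python
from collections import Counter


def f(x: int) -> int:
    """One step of the modified rule: 3x + 5 for odd x, x / 2 for even x."""
    return 3 * x + 5 if x % 2 else x // 2


def loop_label(n: int, cache: dict[int, int]) -> int:
    """Return the smallest member of the loop that n's trajectory enters."""
    path = []
    on_path = set()
    x = n
    # Walk forward until we hit a value whose label is known or a repeat.
    while x not in cache and x not in on_path:
        on_path.add(x)
        path.append(x)
        x = f(x)
    if x in cache:
        label = cache[x]
    else:
        label = min(path[path.index(x):])   # x repeated: the loop starts at x
    for y in path:                          # remember the label for later seeds
        cache[y] = label
    return label


def loop_shares(start: int, stop: int) -> dict[int, float]:
    """Share of the seeds start..stop (inclusive) that end in each loop."""
    cache: dict[int, int] = {}
    counts = Counter(loop_label(n, cache) for n in range(start, stop + 1))
    total = stop - start + 1
    return {label: count / total for label, count in sorted(counts.items())}


for label, share in loop_shares(1, 100).items():
    print(f"loop containing {label}: {share:.1%}")
```

Because `loop_shares` takes both endpoints, the same function covers a range that does not start at 1.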
Here is a link to the raw data my code generated:
https://drive.google.com/drive/folders/0BzfYa_--3heeNkVpd1NPd090aDA?usp=sharing
(Note: viewing the 10,000 sample worked just fine for me; however, you would need to download the 100,000 and 1,000,000 samples to view them.)
The results show that the percentages vary quite a bit from sample to sample; in the general scheme of things, however, the data seem to be somewhat consistent. For example, my data shows that the 19 loop is the end of roughly half the trajectories in each sample. Only one percentage never changed from sample to sample: unsurprisingly, the 20-10-5 loop accounted for 1/5 of all tested values.
I am unsure whether this “loop bias” is a consequence of relying on finite samples to begin with, the result of human/code error, or whether there is a mathematical explanation for what makes certain loops more popular than others. I have a few ideas for why some bias occurs; however, I am not confident in them, mostly because they rely heavily on speculation that I do not know how to prove formally.
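One way to see why the 1/5 share is unsurprising is the following short derivation. It is a sketch, and the symbol $C$ for the original Collatz map ($3m+1$ for odd $m$, $m/2$ for even $m$) is introduced here only for illustration. For a multiple of 5, write $x = 5m$:

$$f(5m)= \begin{cases} 3(5m)+5 = 5(3m+1), & \text{if $m$ is odd} \\ 5m/2 = 5(m/2), & \text{if $m$ is even} \end{cases} \quad\Longrightarrow\quad f(5m) = 5\,C(m)$$

So a multiple of 5 stays a multiple of 5, and $5m$ reaches the 20-10-5 loop exactly when $m$ reaches 1 under the original rule. Conversely, if $5 \nmid x$ then $5 \nmid 3x+5$ and $5 \nmid x/2$, so no other number can ever enter that loop. Assuming the original Collatz conjecture holds for every $m$ involved (it has been checked by computer far beyond $m = 200{,}000$), the seeds that end in the 20-10-5 loop are precisely the multiples of 5, which make up 1/5 of any sample whose size is a multiple of 5.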
EDIT: Here are the loops in order of appearance:
- [1, 8, 4, 2, 1]
- [19, 62, 31, 98, 49, 152, 76, 38, 19]
- [5, 20, 10, 5]
- [23, 74, 37, 116, 58, 29, 92, 46, 23]
- [187, 566, 283, 854, 427, 1286, 643, 1934, 967, 2906, 1453, 4364, 2182, 1091, 3278, 1639, 4922, 2461, 7388, 3694, 1847, 5546, 2773, 8324, 4162, 2081, 6248, 3124, 1562, 781, 2348, 1174, 587, 1766, 883, 2654, 1327, 3986, 1993, 5984, 2992, 1496, 748, 374, 187]
- [347, 1046, 523, 1574, 787, 2366, 1183, 3554, 1777, 5336, 2668, 1334, 667, 2006, 1003, 3014, 1507, 4526, 2263, 6794, 3397, 10196, 5098, 2549, 7652, 3826, 1913, 5744, 2872, 1436, 718, 359, 1082, 541, 1628, 814, 407, 1226, 613, 1844, 922, 461, 1388, 694, 347]
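The lists above can be checked mechanically. This is a small sketch with hypothetical helper names (`is_loop`, `loop_from`): the first confirms that a list written in the format above really is a loop under $f$, and the second regenerates a loop from any one of its members.

```python
def f(x: int) -> int:
    """One step of the modified rule: 3x + 5 for odd x, x / 2 for even x."""
    return 3 * x + 5 if x % 2 else x // 2


def is_loop(seq: list[int]) -> bool:
    """True if each entry maps to the next under f and the list closes on itself."""
    return seq[0] == seq[-1] and all(f(a) == b for a, b in zip(seq, seq[1:]))


def loop_from(start: int, max_steps: int = 10_000) -> list[int]:
    """Iterate f from start until start comes back; the first entry is repeated last.

    Raises if start is not itself a member of a loop (within max_steps).
    """
    seq = [start]
    for _ in range(max_steps):
        seq.append(f(seq[-1]))
        if seq[-1] == start:
            return seq
    raise ValueError(f"{start} did not return to itself within {max_steps} steps")


short_loops = [
    [1, 8, 4, 2, 1],
    [19, 62, 31, 98, 49, 152, 76, 38, 19],
    [5, 20, 10, 5],
    [23, 74, 37, 116, 58, 29, 92, 46, 23],
]
print(all(is_loop(loop) for loop in short_loops))

# The two long loops can be regenerated from their first entries and
# compared against the lists above.
print(loop_from(187))
print(loop_from(347))
```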
EDIT 2:
I agree that smaller numbers may be responsible for skewing the data. Therefore, I tested the range 100,000 to 1,000,000 to check this theory. I uploaded the results to the original Google Doc with the other pie charts.
I was surprised to find, well, the same chart. The ratios were slightly different, as usual, but aside from that I am unsure whether this test debunks the hypothesis or simply reiterates the small-number problem. I could try different sample ranges; however, I do not know whether that is a good idea.
To provide some insight on what I think is going on, I will show you a digital version of some notes I sketched and explain where my speculations came from.
In May, I drew some sketches of trees and made some speculations about what I observed. I assumed that if a loop had a branch or a tail coming from the original even numbers in the loop, then the loop would connect to more numbers. I also assumed that smaller even multiples (if $n$ is odd, then an even multiple is $n \cdot 2^a$, where $a$ is a positive integer) branching to multiples of three "restricted" the size of the loops.
Of course, none of these statements are objective, much less provable. I wanted to share them in case there were any interesting mathematical patterns occurring or if this information shed light on anything...
Here is a digital version of my sketches.
Note: the trees are built using an adapted version of the "reverse Collatz method": where the original method uses $(n-1)/3$, this version uses $(n-5)/3$. To divide $n$ by 2, go one number left. To multiply $n$ by 3 and add 5, find the bottom of the "T", which points to the next even number.
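The reverse step used to build the trees can be written out as a short sketch. The function name `predecessors` is hypothetical. The rule it encodes is that $2n$ always maps to $n$, while $(n-5)/3$ is a positive odd integer exactly when $n \equiv 2 \pmod 6$ and $n > 5$.

```python
def predecessors(n: int) -> list[int]:
    """All positive x with f(x) == n, where f is 3x + 5 (odd x) or x / 2 (even x)."""
    preds = [2 * n]                    # 2n is even, so f(2n) = n
    if n % 6 == 2 and n > 5:           # then (n - 5) / 3 is a positive odd integer
        preds.append((n - 5) // 3)
    return preds


print(predecessors(38))   # the even number 38 branches to the odd number 11
print(predecessors(11))   # an odd number only has its double as a predecessor
print(predecessors(3))    # a multiple of 3 never has an odd predecessor
```

Since $3x+5 \equiv 2 \pmod 3$, no multiple of 3 is ever of the form $3x+5$, and doubling a multiple of 3 gives another multiple of 3. Going backward from a multiple of 3 therefore produces only the chain $n, 2n, 4n, \dots$ with no further branching.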
Warning: I showed this to a friend and the tree sketch confused them. If you find this sketch confusing, let me know and I will re-draw the entire thing with arrows instead.
Key:
- If an even number branches, it will have a "T" above it. The first odd number on the "T" is the odd number that results from applying $(n-5)/3$. The even numbers that follow are the even multiples of that odd number. (For example, in the 19 loop, 38 has a "T" over it: 11 is the resulting odd number, and the even numbers after 11 are $11 \cdot 2^1$, $11 \cdot 2^2$, $11 \cdot 2^3$, ...)
- Blue numbers are members of a loop.
- Red numbers followed by a "no" sign are multiples of 3.
- Purple "T"s connect the loop.
- Green "T"s emphasize the extra "tail" or branch.
- Orange "T"s emphasize where a tail could have been, but the number branched to a multiple of 3 instead.
- Arrows connect the separated ends of the loop.
- "..." are used to convey numbers not shown.
I color-coded the sketch to draw attention to certain properties. I figured it would make it easier to understand.
I'm not sure the best way to communicate. I'd give you my email address, if you're comfortable working that way.
– G Tony Jacobs Aug 22 '17 at 16:11