Author Topic: [Bmax] Fastest way to select x many random elements from a list.  (Read 4461 times)

0 Members and 1 Guest are viewing this topic.

Offline Pixel_Outlaw

  • Pentium
  • *****
  • Posts: 1382
  • Karma: 83
    • View Profile
Recently I've become bored with the usual coding and needed to make something new. I have a Tlist of about 23981 english words in my program and need a very fast way to select x many of them randomly.

Currently I'm removing random elements until the list is the correct length. This takes about a minute to do on my computer and uses about 75% of the processing power to get a final length of 20 words. I think it may be faster to shuffle the list and then select the first x many elements. Or maybe something else.


Here is my current code (trimming the list link by link)
Code: [Select]
' randomly select x many words from list
Local starting_length:Int = word_list.count()

Repeat
ListRemove(word_list,TLink(word_list.valueatindex(Rand(0,starting_length-1))))
starting_length:-1
Until starting_length = selection_size
« Last Edit: February 13, 2009 by Pixel_Outlaw »
Challenge Trophies Won:

Offline Pixel_Outlaw

  • Pentium
  • *****
  • Posts: 1382
  • Karma: 83
    • View Profile
Ok I was modeling the problem wrong once again. Just had an blond moment.

Rather than deleting what I don't need I should just pick our what I do need.

Code: [Select]
' set up a temp list for chosen words
Global temp_list:TList = New TList


For Local i:Int =0 To selection_size-1
Local index:Int =Rand(0,starting_length-1)
ListAddLast(temp_list,word_list.valueatindex(index))
ListRemove(word_list,TLink(word_list.valueatindex(index)))
Next

word_list=temp_list
Challenge Trophies Won:

Offline hellfire

  • Sponsor
  • Pentium
  • *******
  • Posts: 1294
  • Karma: 466
    • View Profile
    • my stuff
As you're using a linked list, you can only access the next element.
To pick an arbitrary index, every element needs to be touch until the element is found.
So what you've got to do to get it fast is parse your list *once* and fetch all the elements you need.
To do that, create random indices and sort those.
Iterate your list to the next index using something like this:
(I'm not sure about the syntax, I don't use BlitzBasic)
Code: [Select]
it:TLink
it=list.firstlink()
for i=0 to (curIndex-prevIndex)
  it=it.nextlink()
Each time fetch the element at "it":
Code: [Select]
Print String(it.value)
However, if the number of elements to be picked from the list is significantly smaller than the size of the list, a list is not the best data-structure for this task.
If the list stays constant, use an array.
If the list changes, use a search tree.
« Last Edit: February 13, 2009 by hellfire »
Challenge Trophies Won:

Offline zawran

  • Sponsor
  • Pentium
  • *******
  • Posts: 909
  • Karma: 67
    • View Profile
I was just thinking the same, that you really shouldn't need to delete, but only pick what few you need. But is there are reason why you want to use a list and not an array for this one?

Here is an example of using an array, which is really fast and easy:

Code: [Select]
Const wordlistsize:Int = 20

Global wordlist:String[wordlistsize]

For Local i:Int = 0 To wordlistsize-1
wordlist[i] = "Argh:"+i
Next

SeedRnd(MilliSecs()) ' randomize the randomizer :)

Print "~nBefore shuffle ..."

For Local ii:Int = 0 To wordlist.length-1
Print wordlist[ii]
Next

Local time:Int = MilliSecs()
ShuffleStringArray(wordlist)

Print "~ntime to shuffle: "+(MilliSecs()-time)+"ms"

Print "~nAfter shuffle ... picking the first five elements>"

For Local iii:Int = 0 To 4
Print wordlist[iii]
Next

' lets remove 5 elements from the front
ShortenStringArray(wordlist,5,1)

Print "~nAfter removing elements just picked..."

For Local iiii:Int = 0 To wordlist.length-1
Print wordlist[iiii]
Next


' ShuffleStringArray() function to shuffle an arrays elements
' Parameters:
' array = the array to be shuffled
' times = the number of times to shuffle the array (3=default)
Function ShuffleStringArray(array:String[] Var, times:Int = 3)
Local tmp:String
Local size:Int = Len(array)
For Local i:Int = 0 To size-1
Local r:Int = Rnd(size)
tmp = array[i]
array[i] = array[r]
array[r] = tmp
Next
End Function

' ShortenStringArray() function to shorten an array by any number of elements
' Parameters:
' array = the array to be shortened
' amount = the number of strings to remove
' first = determines if it removes from the front or the back of the array, 0=back(default),1=front
Function ShortenStringArray(array:String[] Var,amount:Int,first:Int=0)
If first Then
array = array[amount..] ' remove from the front of the array
Else
array = array[..array.length-amount] ' remove from the back of the array
End If
End Function

I haven't tried the same using lists, as my first thought was using an array for something like that.

Offline Pixel_Outlaw

  • Pentium
  • *****
  • Posts: 1382
  • Karma: 83
    • View Profile
Thanks guys. I have it picking out the words very fast now.
Challenge Trophies Won:

Offline zawran

  • Sponsor
  • Pentium
  • *******
  • Posts: 909
  • Karma: 67
    • View Profile
Is it some kind of game you are working on?

Offline Pixel_Outlaw

  • Pentium
  • *****
  • Posts: 1382
  • Karma: 83
    • View Profile
Well, yes and no.

My mother is a 1st grade teacher (and 1st rate too) and she needs a way to incorporate her weekly spelling lists into a word search puzzle. I decided to feature a list of words too in case the kids ever need busy work. She can input her spelling list or simply let the computer provide some words. I used to like them as a kid (before the notion of an algorithm for finding the words killed the magic).

This is going to be like my maze and Perlin noise programs, just a learning challenge. Doesn't have to be the most well written software ever just something to keep my on my toes as a programmer.
Challenge Trophies Won:

Offline zawran

  • Sponsor
  • Pentium
  • *******
  • Posts: 909
  • Karma: 67
    • View Profile
I always have a bunch of small projects going to keep me coding while I wait for an idea for a big project. Yesterday I decided to give opengl 3d a second look, so thats what I am working on right now. :)