For the demos I created for the Guerrilla Tactics session I am doing at NxtGenUG I decided I was rather bored of the usual test data of Customer 1 …. 99999. So instead adapted a script we have used in the past to anonymise client data.
It is quite a simple script the only parameter is @TotalCustomersPerSex. This can be set anywhere between 1 and 50,000 to give unique names. If you want some duplication or more name you can increase it or mess about with the source data.
The script has been written to target MS SQLServer 2008 although it could be altered to work with earlier versions of SQLServer.
The script does the following:
- Create a customer table
- Populates a temporary table with the 500 most popular surnames in England, Wales and the Isle of Man. (From the very helpful site http://surnames.behindthename.com.)
- Populates another temporary table with the 200 most popular boys names from England & Wales, randomly assigning a number I later use to populate the title.
- The same is then done for 200 girls names. The sources for both of these list was from http://www.statistics.gov.uk.
- I then create and populate a temporary table for males tiles and then repeat the process for females.
- I then have an insert statement selecting from forenames and titles and then cross joining the surnames table. This gives a Cartesian product.
- To populate information such as customer reference, initials, Post code, age, email address and password I extract characters from the forename and surname
I used the RAND function to populate the title field on insert is because this is only evaluated once per query. So if it is used in the final select only one random value would be selected. Not what I wanted.
I hope you find this useful.