Meaningless Numbers and Missing Data

Post by keith »

Here's the context.

I've got a record collection of 1293 albums. It's cataloged on Discogs, which provides a 'market value' (min, median, max) for each album that has been sold through the Discogs marketplace.

Discogs values my collection at 47003.00. (I'll use the max value because it looks more impressive, though of course I'd never get that much for it.) That gives an average value of 36.35 per album.

However, 236 albums have never been sold via Discogs, so my valuation of 47003.00 comes from only 1057 records, giving a per-album value of 44.47.
Also, the most expensive single item in my collection is a box set of 11 albums. I think I should really count it as 11 albums, which makes 1067 valued albums and an average per-album value of 44.05.

Now if I take the calculated average values and plug them in for the albums with no actual observed value, I get an extrapolated collection valuation of $57,497.52 taking the box set as 1 album or $56,958.65 counting the box set as 11 albums.
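
For anyone who wants to check the arithmetic, here is a minimal Python sketch that reproduces the 'High' figures. The function and variable names are made up for illustration and have nothing to do with Discogs itself. It assumes the extrapolation is simply the observed average multiplied by the full collection count of 1293, which is what matches the numbers in this post for both box set counts.

```python
# Illustrative sketch only: names are hypothetical, figures come from the post.
TOTAL_ALBUMS = 1293      # all albums in the collection
UNVALUED = 236           # albums never sold via Discogs
BOX_SET_ALBUMS = 11      # albums inside the single box set item
MAX_VALUE = 47003.00     # Discogs 'max' valuation of the collection


def per_album_average(total_value, album_count):
    """Average value per album over the given album count."""
    return total_value / album_count


def extrapolate(total_value, valued_count, collection_size):
    """Mean imputation: observed average scaled to the whole collection."""
    return per_album_average(total_value, valued_count) * collection_size


valued_box_as_1 = TOTAL_ALBUMS - UNVALUED                  # 1057
valued_box_as_11 = valued_box_as_1 + BOX_SET_ALBUMS - 1    # 1067

avg_all = per_album_average(MAX_VALUE, TOTAL_ALBUMS)
avg_box_as_1 = per_album_average(MAX_VALUE, valued_box_as_1)
avg_box_as_11 = per_album_average(MAX_VALUE, valued_box_as_11)

# Assumption: both extrapolations scale to 1293, as the posted figures do.
extrap_box_as_1 = extrapolate(MAX_VALUE, valued_box_as_1, TOTAL_ALBUMS)
extrap_box_as_11 = extrapolate(MAX_VALUE, valued_box_as_11, TOTAL_ALBUMS)

print(f"{avg_all:.2f} {avg_box_as_1:.2f} {avg_box_as_11:.2f}")
print(f"{extrap_box_as_1:,.2f} {extrap_box_as_11:,.2f}")
```

Swapping MAX_VALUE for the minimum or median collection value should give the other columns in the same way.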

None of these figures are worth much, and I doubt very much that the insurance company would pay any attention to them if the house burns down. I have never liked this method of accounting for missing data, and I'm sure there are better ways (there are other sources of market data that may include my albums, but that is likely too much hard work). I'm sure I used to do something 'fancy' back in the day, when I was working on metrics for sizing computers and telling the boss that our machine was gonna die next week. But back then I had lots of stats libraries, canned methods and other crutches to work with, and I can't remember what they were anyway.

Anybody know a better methodology to account for and extrapolate missing data?

All albums in collection           1293
Albums with no value listed         236
Rolling Stones box set count         11

Total Albums in Collection         1293
Albums With values & RSBox=1       1057
Albums With values & RSBox=11      1067

Collection Value Minimum      13,926.33
Collection Value Median       24,172.31
Collection Value Maximum      47,003.00

                           Average Value per Album
                                   Min           Med       High
Total Albums in Collection        10.77         18.69      36.35
Albums With values & RSBox=1      13.18         22.87      44.47
Albums With values & RSBox=11     13.05         22.65      44.05

                           Extrapolated Collection Value
                                   Min           Med       High
Total Albums in Collection    13,926.33     24,172.31  47,003.00
Albums With values & RSBox=1  17,035.71     29,569.34  57,497.52
Albums With values & RSBox=11 16,876.05     29,292.22  56,958.65

Any sufficiently advanced troll is indistinguishable from a genuine kook