I manage a few sites which are quite old, and where the assets folder contains over a gig of files which the client has been uploading for years.
What I'm looking for is a module which lists all the files in the assets area, with their file sizes, date uploaded, and how many times each file is used (e.g. linked to through a has_one in the database, or linked into the HTML content of a page).
Then, either through script, or manually, the client would be able to delete their old files, and make the whole site a lot easier to manage.
SS3 seems to track images used in UploadFields, but in HTMLEditorFields images and files are inserted as HTML with their URLs. It would be good if these were replaced with database IDs, like how page links are done, so there was a means of saying if, where and how often an image was used.
I'm just going off the top of my head, so it's totally hacky, but I guess you could write a custom controller that loops through all File objects, then searches all the content fields (on Page plus any custom DataObjects with HTML content) for a version of the $Name field stripped of the extension. You could then check any relations as well. If the search returns no results, it deletes the object. Not an efficient function, but theoretically you'd only run it occasionally, and probably manually.
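To make the idea concrete, here is a rough, framework-free sketch of just the matching step, written in Python rather than as a SilverStripe controller. Everything in it is an assumption for illustration: the names `stem` and `find_unreferenced` are made up, the file names and HTML strings are sample data, and it only reports candidates instead of deleting anything. It does not check has_one relations.

```python
import os


def stem(file_name):
    """Return the file name without its extension, e.g. 'logo.png' -> 'logo'."""
    return os.path.splitext(file_name)[0]


def find_unreferenced(file_names, html_fields):
    """Return the file names whose stem appears in none of the HTML strings."""
    unused = []
    for name in file_names:
        needle = stem(name)
        # Plain substring search: crude, but it errs on the side of keeping files.
        if not any(needle in html for html in html_fields):
            unused.append(name)
    return unused


# Sample data standing in for the file records and the page content fields.
files = ["logo.png", "old-banner.jpg", "brochure.pdf"]
content = [
    '<p><img src="assets/logo.png" alt=""></p>',
    '<p><a href="assets/brochure.pdf">Download</a></p>',
]

print(find_unreferenced(files, content))  # ['old-banner.jpg']
```

Because it is a plain substring match, a very short name will match almost any content and be kept, which is probably the safer failure mode. I'd have it print a list for review before letting anything delete files.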
Like I said, hacky, but hey, I'm writing this while eating lunch :)
Well, I was thinking of throwing something like that together this weekend. I'm moving a client's site from shared hosting (DreamHost) to a Media Temple VS, and if I don't have to move the chuff it'll be much better. If I'm successful I'll post my code. My worry is that on shared hosting that many calls may choke on hard memory limits.
You could probably just start by getting the first 100 files or so and finding any links batch by batch. Otherwise, like you say, it could overload things if there are thousands of files being searched for across thousands of pages.
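The batch-by-batch part could look something like the sketch below, again in plain Python and not tied to SilverStripe. The names `iter_batches` and `fetch_page` are hypothetical; `fetch_page(offset, limit)` stands in for whatever query returns one slice of file records, and here it just slices a sample list.

```python
def iter_batches(fetch_page, batch_size=100):
    """Yield lists of records, calling fetch_page(offset, limit) until it returns nothing."""
    offset = 0
    while True:
        page = fetch_page(offset, batch_size)
        if not page:
            break
        yield page
        offset += len(page)


# Sample data: 250 made-up file names standing in for the file table.
all_files = ["file-%d.jpg" % i for i in range(250)]


def fetch_page(offset, limit):
    """Hypothetical stand-in for a LIMIT/OFFSET style query."""
    return all_files[offset:offset + limit]


sizes = [len(batch) for batch in iter_batches(fetch_page)]
print(sizes)  # [100, 100, 50]
```

Only one batch of records is held at a time, so memory use should stay roughly flat however many files there are. The trade-off is more round trips to the database.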