... And downsides

Though the Google Mini package is undoubtedly high in quality, we can't claim it's completely flawless. As we hinted previously, one of the first real limitations we ran into had to do with security. On both fileservers and webservers, the Mini is currently rather limited in properly handling secured content. The fact that the Google Search Appliance offers many of these capabilities isn't much consolation, because the price jump will be too large for smaller companies to afford, requiring them to either skip the crawling of their secure documents, or else find other solutions.

Our own internal wiki system is a good illustration of this problem. It contains information that's confidential to our lab, so we require users to authenticate before viewing any information. Since this system uses cookie-based authentication, the Mini was unable to crawl any of our wiki pages, which was actually the primary use we had in mind for it.

The Mini's inability to enforce authentication for our Samba shares created another obvious problem for our lab, where access to nearly every file would ordinarily require authentication. Having the Mini crawl our fileserver left our documents open for anyone to see, and since the Mini serves the results as direct download links, our own safety measures were rendered ineffective.



Using the Mini to index our fileserver made our secured content downloadable to everyone with access to the Mini.

Luckily, this did not cause too much trouble in our case, since our intranet is only accessible to a limited group of users, but it is still a factor to keep in mind when considering a Mini. Some changes to existing security systems may be necessary to keep sensitive content safe.

The lack of control over the Mini itself also bothered us. Not only are we completely locked out of the OS itself, but monitoring the system's status is out of the question. Though the administration panel does give us a "System status" page, the information provided there is so sparse that it might as well not be there at all. We hope Google will implement more detailed monitoring here in the future.



A screenshot containing the full package of the Mini's monitoring tools.

Lastly, we would also appreciate a clearer overview of the crawler, and more direct control over it. At this point, any directions it is given seem to be put into a priority queue of sorts, which does get crawled first, but the process isn't really transparent and gives you no real feedback on the results of your commands. This may cause some confusion as to whether your commands are being handled at all, and made us wish we could see what the crawler was doing in real time. Granted, there are some reporting options available, but we found them rather lacking, and the wonderful Google Analytics system really puts the Mini's built-in reporting options to shame.

In general, we feel that there's still a lot of room for improvement in the Mini's management console, both in usability and general provision of information. We hope that Google looks into these issues when they release their next update for the machine.


19 Comments


  • GhandiInstinct - Friday, December 21, 2007 - link

    lol
  • legoman666 - Friday, December 21, 2007 - link

    I would have expected this product to be a few years old with hardware like that. A Prescott? Seriously. And no RAID?
  • razor2025 - Friday, December 21, 2007 - link

    It's a search engine appliance. The product's main focus is in its software algorithm, not how "fast" the hardware itself is. Why would it need RAID? Any sane network/system administrator will have this box backed up in regular interval to the backup array / server. RAID != back up and this product doesn't need the file system performance either.
  • legoman666 - Friday, December 21, 2007 - link

    I didn't comment about the Prescott and the lack of RAID based on a performance concern. The Prescott is hot and inefficient, why not get something that uses less power (i.e., a C2D) even if it doesn't need the added processing power of a newer chip? That way, they could market it as an efficient device or green or whatever.

    As for the RAID, I am not talking about RAID0 (technically that's not even RAID), I was leaning more towards RAID1 or RAID5. They mentioned in the review that it took 36 hours to crawl to the 50000 document capacity, I'm sure most people wouldn't want their search function down for 36 hours while the engine reindexes because it wasn't backed up. Not only that, but you'd probably have to send it back to Google for repairs with only a single drive. With 2 in RAID1, if one dies, a replacement could easily be swapped in.
  • razor2025 - Friday, December 21, 2007 - link

    Maybe it's an option you can request from Google. As for your take on RAID, you're still treating it as backup. It would be much simpler if they had a second backup Google Mini instead. Look, they're charging you for the license per document, not how many Minis you have hooked up. Also, it's in a 1U form factor. I highly doubt they can manage to squeeze in another drive to satisfy your "RAID!" obsession.
  • Justin Case - Friday, December 21, 2007 - link

    Backups take time to restore from. RAID1 means no downtime. It *is* a backup, and one that's available instantly.

    It doesn't replace regular, preferably _remote_ backups, but it's a pretty basic feature of any system designed to have zero downtime.
  • reginald - Wednesday, January 2, 2008 - link

    RAID and backup are two entirely different things. No RAID in the world can protect you against the same things as backups can (handling errors, programs incorrectly overwriting data, etc). And backups can never replace RAID to achieve continuous availability.

    Thinking you need no backups because you have RAID is like thinking you need no seatbelt because you've got insurance. They simply aren't the same.
  • rudder - Friday, December 21, 2007 - link

    Prescott performance aside... as the article mentioned this is a 24/7 device... why use such a toaster of a cpu when Core2Duos would not add a whole lot to the bottom line?
  • Calvin256 - Tuesday, January 1, 2008 - link

    If you're looking at the prices as a consumer, that may be the case, but you need to remember that Google/Gigabyte is not you or I. When purchasing in bulk those processors can be VASTLY cheaper than we could ever hope to pay, even when they're in the bargain bin at shadyetailer.com. Things made for consumers can easily be marked up 200-2000%, things made for OEMs might have a 50-100% margin.
