November 10th, 1998
Maintaining Huge Web Sites

11/10/98 RMIUG Meeting Minutes - Maintaining Huge Web Sites

Alek Komaritsky called the meeting to order at 7:00 p.m. About 120 people were in attendance.

He introduced himself and the other members of the Executive Committee in attendance: Dan Murray and Tom Bresnahan. Not present were Bryan Buus and Art Smoot.

Alek then opened the meeting to announcements from the floor:

- Darren Powderly of TEKsystems dpowderl@teksystems.com announced that there are job openings with his company. He described TEKsystems as one of the top technical service providers in the United States. Due to their rapid growth, they are constantly in need of technical professionals with information systems skills at all levels -- including Help Desk specialists, field engineers, systems engineers, project managers and consultants. Company URL: http://www.teksystems.com/ Contact information: Darren Powderly: (303) 412-2721 or Matthew Leuch: (303) 412-2743 or mleuch@teksystems.com

- Keith Buchanan keithb@seinesys.com announced that he was looking for web development references and resources including books, web sites, etc. for HTML coding, web design, marketing. Several members of the audience and rmiug-discuss mailing list had suggestions. You can view the mailing list archive (ftp://ftp.rmiug.org/rmiug/rmiug-discuss-archive) for some of these suggestions. Keith plans to publish a web page shortly with the collected resources at http://www.seinesys.com/resources, and he welcomes any additional input.

- Dan Murray dan@rmiug.org announced that 1999 would bring six RMIUG meetings and asked the audience to submit their ideas for meeting topics and speakers.

- Dave Goldhammer from the University of Colorado Information Technology Services department was scheduled to speak, but was out of action due to the flu.

Alek had several copies of the book "Poor Richard's Web Site: Geek-Free, Commonsense Advice on Building a Low-Cost Web Site" by Colorado author Peter Kent (PKent@TopFloor.com). (http://www.poorrichard.com/) These were awarded to deserving audience members. One winner came the furthest to attend the meeting (from Colorado Springs), and the other winner was the person newest to web design.

- Don Milani Don.Milani@Sun.com was the first speaker. He is a Manager with Web Services Engineering at Sun Microsystems in Broomfield. He started his presentation with some information about Sun.

Revenue $9.7 Billion, Fortune 200 Company
26,000 Employees and growing
Global Operations: 250 locations in 90 countries
New Campus in Broomfield will employ up to 4000 employees over the next few years.
Network Computing
Industrial-Strength Chip, OS, Systems & Mass Storage Technology
Java Programming Language

Sun's Information Resources (IR) organization consists of 5 Production Data Centers (PSC) World-Wide. SUN IR handles:

6 Terabytes of PSC Data
4,000,000 E-Mails Per Day
1,200,000 OLTP Transactions Per Day
37,000 Workstations
9,000 Servers
8,000 Remote Access Accounts

Web Services Engineering (WSE) is part of Sun's overall IR organization and develops the technology architectures, infrastructure and core services to enable Sun's Web and E-Commerce business strategy. The term E-Commerce encompasses the terms E-Commerce (Internet storefront) and E-Business (Extranet business-to-business electronic interchange). They deliver secure, robust, low-cost, high-performance, network-based web-hosting and transaction infrastructure bundled with site management and creation tools, which enable end-users to focus on content and functionality without regard to hardware, OS, network infrastructure, or geographic location.

Three of Sun's networks are:

  • sun.com - Sun's information and E-Commerce portal on the public internet.
  • SunWeb - Sun's internal intranet.
  • sun.net - an Internet site that provides Sun employees with secure access to Sun's web-based resources through the public Internet from a Java-enabled, SSL-supported browser. So instead of connecting to Sun's Wide Area Network by dialing in via a modem, remote users connect from any web browser. Once there, they can use Sun's corporate tools such as Sun email, access their calendar, use NameTool, Sun's internal corporate name directory (you need this with 26,000 employees), and access Intranet web sites. Sun.net is being patented and will become a product that other companies can buy.

The WSE Areas of Focus include:

  • Architecture Design
  • Infrastructure Deployment
  • Core Services
These 3 areas primarily break down into the 3 main groups within the WSE organization, which has 14 employees: the Architecture group, the Engineering group that does the infrastructure part, and the Services Group. But we all cross-pollinate and work within each other's areas.

The Architecture Design group consists of three people who concentrate on:

  • E-Commerce
  • Intranet
  • Internet
The Infrastructure Deployment group consists of four people who concentrate on:
  • Cache/Proxy Design
  • Web Server Review & Recommendation
  • Content Mirroring
  • Web Access Optimization
They have the responsibility to build and set up what has been architected.

Core Services concentrates on:

  • SunWeb Operations
  • Intranet Web hosting
  • sun.com technical Operations
  • Web Authoring & Construction Toolkits
  • E-Commerce Support Model
My group consists of 6 Webmasters (actually 5 1/2, one person works part time). So once it's been architected and built we oversee the ongoing services of running it. So my group's primary responsibility consists of services for Sun's Intranet, known as SunWeb, www.sun.com infrastructure webmastering, and the company's centralized web hosting services, known as SunIntraWeb. We provide web mastering services and authoring toolkits and an e-commerce support model.

SunWeb Statistics:

3,600 Intranet web sites
4,000,000 Intranet web pages
6,300,000 page transfers per day
500GB WAN http traffic per day
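
To put the daily SunWeb figures in per-second terms, here is a small arithmetic sketch. It assumes that 500GB means decimal gigabytes and that traffic is spread evenly over the day, neither of which was stated in the talk; real traffic peaks during working hours, and the WAN total may include more than page transfers.

```python
SECONDS_PER_DAY = 24 * 60 * 60

page_transfers_per_day = 6_300_000   # from the SunWeb statistics
http_bytes_per_day = 500 * 10**9     # 500GB, assuming decimal gigabytes

def per_second(daily_total):
    """Average rate per second for a total accumulated over one day."""
    return daily_total / SECONDS_PER_DAY

avg_pages_per_second = per_second(page_transfers_per_day)
avg_megabits_per_second = per_second(http_bytes_per_day) * 8 / 1_000_000
avg_kb_per_transfer = http_bytes_per_day / page_transfers_per_day / 1000

print(f"about {avg_pages_per_second:.0f} page transfers per second")
print(f"about {avg_megabits_per_second:.0f} Mbit/s of WAN http traffic")
print(f"about {avg_kb_per_transfer:.0f} KB of traffic per page transfer")
```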

SunWeb ROI and benefits can be broken down into the following categories:

Savings:
~~~~~~~~

  • Reduces the cost of doing business.
  • Reduced distribution costs: savings on pre-press, printing, packaging, CD pressing, warehousing, shipping, fulfillment, mailing-list maintenance, and archiving costs.
  • Reduced cost of SMI Gatekeeper labor through HTML and graphical SunWeb templates, style guides and tools.
  • Reduced employee web site development time with the SunWeb Construction Kit, templates and style guides.
  • Reduced calls to the Resolution Center.
  • Reduced employee surf-time through web directories and enterprise wide search technologies.
  • Rapid information deployment.

Cost Avoidance:
~~~~~~~~~~~~~~~

  • Standard graphics and templates avoid logo, copyright and information violations.
  • All documentation online: obsoletes hardcopy information distribution.
  • Avoids costs of division-specific web product evaluations, development, and singular solutions that are not extendable to other Divisions or supportable.
  • Standards avoid loss of Internet/Intranet integrity.

Qualitative:
~~~~~~~~~~~~

  • Timely delivery of mission critical information to a world-wide audience.
  • Data mining--the ability to obtain and process information: intake, churn and output, leading to knowledge and wisdom for the organization.
  • Disintermediation: the removal of barriers between an audience (customers, employees, and partners) and the group offering the information or service.
  • Workflow improvements.
  • Increased employee productivity.
  • "24x7" organizational productivity.

How Does Sun Use the Intranet?

  • Repository and Distribution of Corporate Information and News
  • Collaboration and Knowledge Sharing
  • Application Development Environment & Deployment Mechanism
  • Data Feed for Sun's E-commerce Applications
  • Laboratory for Large-Enterprise Web Computing
Using SunWeb as a laboratory for large-enterprise web computing gives us knowledge that lets us drive architecture, infrastructure and support issues in the Intranet, Internet, and E-Commerce spaces.

SunWeb is the default home page for Sun's internal users. Along the left panel of the screen shot are the major corporate resources. To find what you need you can use the search function or the A-Z index.
Q. What search engine do you use?
A. Infoseek.

In general, SunWeb Content can be broken down into 3 major categories: Information, Business Applications, and Web Construction and Publishing.

1. Information:

  • Corporate News and info.
  • Campus maps, directions, and phone numbers.
  • Links to branch offices in major cities and international offices around the world.
  • Complete Human Resources information online.
  • Sun Library Information: all types of industry resources.
  • Extensive Divisional data:
    • All the Divisional web sites
    • Sales & Marketing slide shows
    • Engineering data
    • ...and more!
  • The seven divisions or business units are:
    • Computer Systems
    • Enterprise Services
    • Network Storage
    • Microelectronics
    • Solaris Software
    • Java Software
    • Embedded and Consumer

2. Web Construction and Publishing is handled by my group. We don't really do any content; we mainly enable content production.

3. Business Applications are the interfaces to help Sun run its business. All of them are web based and most are or are becoming Java based. They include the following tools:

  • NameTool - name directory to help figure out the 26,000 employees at SMI.
  • OrgTool - Organizational Chart Tool
  • Schedroom - tool for scheduling conference rooms
  • WebDist: application distribution mechanism, so it's a "Third Party Software Kiosk".
  • ServiceDesk - for reporting problems with your system.
  • Facilities Tool - for requesting any type of Facilities service.
  • SunTEA - Travel Expense Authorization and accounting.
  • SunU: for registering for Sun University classes--our internal training org.
  • A tool to access your personal payroll information (PayCheck) and Benefits information (Galaxy).
  • SurveyTool - create, deliver, and sort your own survey
  • And many more!

WSE SunWeb Operations include:

  • The SunWeb Council
  • Intranet Architecture
  • SunWeb Webmaster Services
  • Search Engine Services
  • FTP Services
  • Web Stats Reporting
The SunWeb Council oversees all Intranet-related issues. It is a cross-discipline team that works together to do what is in the best interests of the entire Sun community. Therefore, it is a forum for setting web-relevant guidelines and procedures, airing concerns, sharing ideas, and getting to consensus on the business issues concerning the SunWeb community.

Here are some links to some great resource materials:

  • Multidisciplinary Teams: A Must for Web Development: http://www.computerworld.com/home/emmerce.nsf/all/davecol
  • How to set up your intranet organization and policies: http://www.sun.com/sun-on-net/Sun.Overview/policyindex.html

The Key Players on the SunWeb Council are:

  • WSE - We chair the SunWeb Council. We act as the focal point, facilitators, referee (in the sense of "one to whom a thing is referred" and not meaning "having final authority") and traffic cop (i.e., this issue should go to this sub-committee for study and recommendation, etc.).
  • Gatekeeper community: These are people from the various Sun divisions who act as the central point of contact to the WSE SunWeb Webmasters, and for the content providers in their divisions.

Content Providers include:

  • Employee Communications: provides the SunWeb front page content. It is overseen by the Intranet Editor-in-Chief.
  • Marketing Communications: MarCom provides the Sun branding guidelines which are used throughout Sun's Intranet.
The Technical Architecture and Infrastructure support for SunWeb is spearheaded by Dave. He is helped by the other Webmasters in WSE.

Dave's 24/7 SunWeb Webmaster Services include:

  • Intranet Architecture
  • Search Engine Services: Second most visited site on SunWeb.
  • FTP Services
  • Web Stats Reporting
  • Web server set up and administration
  • CGI scripting
  • .. and learning new video games

Patrick is the webmaster for SunIntraWeb, Sun's Internal Web Hosting Service

  • Everything is handled for the internal customer (except content):
  • Consulting, administration of server, scripting, back-up, etc.
  • Currently 300 accounts at various service levels
  • Service growing at ~10%/month

WSE SunWeb Operations provide web construction toolkits for enabling page and site development. Toolkits include templates, formats, guidelines, resources and a library of tools. E-Commerce Construction Toolkits consist of Common Services - reusable development components and frameworks with security designed in. The toolkit contains templates and simplified processes for web authoring, content management, and publishing on SunWeb and SunIntraWeb, including information on how to construct web pages and web sites:

  • Designing
  • Building
  • Testing
  • Publishing
  • Maintenance
plus a library of tools for doing all of the above:
  • HTML Editors
  • How to do HTML Conversion
  • Perl and Java Programming resources
  • How to handle graphics
  • How to work with meta tags and get indexed by Sun's search engine.
  • How to implement stats reporting.
  • Suggested Web Servers
The toolkit contains the resources Sun people will need for getting the job done:
  • How to publish on Sun's Intranet and Internet.
  • The Publishing policies
  • The SunWeb Council
  • How to use the Sun web hosting service.
  • Sun.com editorial board

E-Commerce Construction Toolkits concentrate on "Common Services" that all developers will need in doing E-Commerce applications. Examples of Common Services are things such as Authentication, Session management, etc. These will be developed as reusable development components & frameworks that will operate at an enterprise-wide level, and be adaptable to changing business requirements. Components are packaged functional units (usually classes and objects). These address 40-60% of the common services you would have to write anyway in an E-Commerce application. Frameworks define the way in which a system of components interact. We're working hand-in-hand with the Sun.net team to ensure compatibility and interoperability.

SunWeb was named one of 50 top Intranet web sites (out of 700 applicants) in the 2nd annual CIO Web Business awards (July 1st 1998 issue). http://sun.com/smi/Press/sunflash/9807/sunflash.980706.2.html

Sun has over 180 job openings in Colorado, mostly in Broomfield. http://www.sun.com/jobs http://suncolorado.com

Questions from the audience for the Sun team:
Q. Does the WSE exert absolute control over web content?
A. On the intranet side, more guidance than control. On the sun.com side, the sun.com editorial board controls it.
Q. What fraction of corporate cost goes into Intranet?
A. 25 people keep it going. ROI - saves $25 million in publishing costs, mail, and printing. It's hard to measure.
Q. How to organize the content? What about differing opinions?
A. We just redesigned the look and feel for low bandwidth clients.
Q. Is there a policy so every employee can have a web page?
A. It's free form, but E-Commerce is more structured.
Q. Extranet was not a term that was used.
A. It will be called E-Commerce, E-Business.
Q. Define.
A. E-Commerce = storefronts, E-Business = business to business.
Q. What's on SunWeb for employee to employee communication?
A. There are employee communication groups, and the editor of SunWeb is a member of the Employee Communication Group. Mostly email lists.
Q. No online forums?
A. Internal Newsgroups.
Q. Can an employee create their own groups?
A. Yes. There are probably over 1000.
Q. Are web pages only within Sun or accessible from outside Sun?
A. Just in Sun.
Q. Does the SunIntraWeb use streaming media?
A. Starting to lay networks. Big infrastructure cost. Boeing has it on every desktop, costs major bucks.

- The next speaker was Randall Gaz gaz@xor.com, a Senior Programmer in the Internet Technologies Group at XOR Network Engineering.

XOR is eight years old and has 45 employees. It maintains rmiug.org and the RMIUG mailing lists. The Internet consulting group develops high exposure sites like Sporting News (http://www.sportingnews.com), Golf Online, and SkiNet. These sites are engineered with performance in mind. What are the decisions that affect this?

Responsiveness - making large sites responsive to the user. Why is this important? You don't go back to slow sites. The Disney Toy Story site was tuned, and the next day it got a 30% increase in hits.

How do I do it? - Pre-generate HTML pages as much as possible. Tune database access, design in pieces by heavy use and light use, tune the operating system and server, and plan for adding more machines. You will need more machines. It's more expensive to make one machine bigger than to use multiple machines.

Pre-generate content - The Sporting News publishing system generates pages for daily stories, AP feeds, and their own stories in St. Louis, MO. The stories are converted to HTML and then transferred to Boulder. The system uses an Informix database with complex queries and hyperlinks on player names. The old database was Illustra, and page generation took up to two minutes. You can't do that online. What's the tradeoff for pre-generated pages?

You need lots of disk space. 50,000 pages require multiple megabytes. The web server is tuned to send complete files and cache files in the browser, so downloads are faster next visit.
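
The pre-generation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not XOR's publishing system: the `stories` records, the `PAGE` template, and the `pregenerate` function are all hypothetical, and a real system would fill the records from its database queries rather than from an in-memory list.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page template; a real site would have a richer layout.
PAGE = Template("<html><head><title>$title</title></head>"
                "<body><h1>$title</h1><p>$body</p></body></html>")

def pregenerate(stories, out_dir):
    """Write one static HTML file per story and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for story in stories:
        page = PAGE.substitute(title=html.escape(story["title"]),
                               body=html.escape(story["body"]))
        path = out_dir / f"{story['slug']}.html"
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written

# Stand-in for the result of an expensive database query (assumed data).
stories = [
    {"slug": "opening-day", "title": "Opening Day", "body": "Season starts."},
    {"slug": "trade-news", "title": "Trade News", "body": "Players & picks."},
]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in pregenerate(stories, tmp):
            print(path.name, path.stat().st_size, "bytes")
```

The slow work (the query and the templating) happens once at publish time, so the web server only has to send finished files.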

Database driven sites - Just because content is in a database doesn't mean that you have to access the database each time. On the catalog site, 200 pages are pre-generated, so it is very responsive. Pick a preferred view of the data for the first time users see it. Other views are not necessary for each user.

Application development - Break up apps into pieces; some are seen by every user, some are seen only once per user. On Sporting News each user has a custom home page with favorite teams and sports. This page is stored on the server.

Tune the operating system - Use a modern version of the operating system, less than 2 years old. Use lots of memory to cache. Use the machine only as a web server, not also as a mail or file server. Dedicate specific machines to specific web sites. Design to split services across multiple machines. Sporting News gets 100,000's to millions of hits per day. The Pumpkin Carving tools web page got two million hits per day just before Halloween.

Q. What about performance?
A. XOR uses the Apache web server. Netscape didn't scale for shared servers or for different configurations for each server, so it can't share processes. We use IIS on NT. Turn off host name lookups to help user responsiveness; the lookup is done before sending a page, which slows things down. Remove unnecessary functions, such as server side includes and user directories. This setup is different from the University of Colorado, which has to supply these services. Take out Apache negotiation with MIME types.
Q. How do you decide these?
A. Server side includes were a policy decision. We don't have users, so we didn't need user directories. Other modules were removed based on experience.
Q. What do you do when the customer complains about not having server side includes?
A. We work around it by pre-generating content. Otherwise it's too slow.
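
With host name lookups turned off (in Apache this is the `HostnameLookups Off` directive), the server logs raw IP addresses, and names can still be resolved later, away from the request path. The sketch below shows one way to do that offline. The function names are hypothetical, and it assumes each log line starts with the client address followed by a space, as in the common log format.

```python
import socket

def reverse_dns(ip):
    """Resolve an IP address to a host name, falling back to the IP on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip

def resolve_log_hosts(lines, resolver=reverse_dns):
    """Replace the leading IP address of each log line with a host name.

    Each distinct address is resolved once and cached, so the cost is paid
    offline and per address rather than online and per request.
    """
    cache = {}
    for line in lines:
        ip, sep, rest = line.partition(" ")
        if ip not in cache:
            cache[ip] = resolver(ip)
        yield cache[ip] + sep + rest
```

Passing the resolver in as a parameter keeps the sketch easy to try without network access, for example with a stub that returns a fixed name.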

Regarding multiple machines - When we changed from one to two servers, some CGI didn't work with more than one machine. Publishing systems, like those at Ziffnet, CNet, and newspapers, separate content from presentation. You don't need a large machine if the site is designed right. A 486 can saturate a T1 with pre-generated content. We use BSDI UNIX on commodity Intel hardware.
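
The T1 remark is easier to appreciate with some rough arithmetic. A T1 line carries a nominal 1.544 Mbit/s; the page sizes below are assumptions chosen for illustration, not figures from the talk, and protocol overhead is ignored.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate

def pages_per_second(page_bytes, link_bps=T1_BITS_PER_SECOND):
    """Upper bound on pages per second the link can carry, ignoring overhead."""
    return link_bps / (page_bytes * 8)

if __name__ == "__main__":
    for size_kb in (5, 10, 50):  # assumed page sizes, not from the talk
        rate = pages_per_second(size_kb * 1024)
        print(f"{size_kb:>3} KB pages: about {rate:.1f} pages/second fills a T1")
```

At an assumed 10 KB per page, the line is full at roughly 19 pages per second, or about 1.6 million pages per day. Sending static files at that rate is a modest load for a server, so the link becomes the bottleneck before the machine does.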

Q. How do you monitor Apache servers?
A. Homegrown Perl scripts, monitor SSL servers.
Q. How do 2 servers communicate?
A. All the HTML is on both machines except standings and other content. DNS redirect finds it on the other machine.
Q. Modern browsers send http host so you don't need a separate address for each machine.
A. We give each one its own address since we have enough IP addresses, for legacy browsers.
Q. Rick Duffy at SuperNet posts browser statistics, and there are still a lot of AOL users. Do you design for their old browser?
A. Most AOL users upgrade to a modern browser.
Q. Does XOR have 7/24 operations on site or on call?
A. Servers are amazingly reliable - 350 days for sportingnews.com. Even NT has been up for 200 days. Pagers are sufficient for now.
Q. Pre-generated pages are better, but what about database updating?
A. On Sporting News, when a story is changed, it's written out to HTML. For pricing catalogs, we do scheduled updates every couple of hours.
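
The last answer, rewriting a page when its story changes or on a schedule, can be sketched as follows. The helper `publish_if_changed` is hypothetical, not part of XOR's system. It skips pages whose content is unchanged, which keeps file modification times stable so that browser caches stay valid.

```python
import tempfile
from pathlib import Path

def publish_if_changed(path, new_html):
    """Rewrite a static page only when its content differs.

    Returns True when the file was written, False when it was left alone.
    """
    path = Path(path)
    new_bytes = new_html.encode("utf-8")
    if path.exists() and path.read_bytes() == new_bytes:
        return False
    # Write under a temporary name, then rename, so the web server
    # never serves a half-written page.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_bytes(new_bytes)
    tmp.replace(path)
    return True

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        page = Path(d) / "story.html"
        print(publish_if_changed(page, "<p>first draft</p>"))   # written
        print(publish_if_changed(page, "<p>first draft</p>"))   # unchanged
        print(publish_if_changed(page, "<p>corrected</p>"))     # rewritten
```

A scheduled job for something like a pricing catalog could call this for every page on each run and only touch the pages whose data actually changed.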

After Randall finished his presentation, he was joined on stage by Don, Dave, and Patrick from Sun for more audience questions.

Q. You're at the leading edge of technology, so how do you learn or just keep up? There aren't any books for this, are there?
A. Randall - I'm application directed, and I read perl.com, newsgroups, search the web. Dave - I ask someone we work with, and newsgroups are good too. Internal technical community and internal newsgroups at Sun help. It's easier when it's all on the same platform. Internal customers are more forgiving, and a good resource too.
Q. Extranet - Do you work with vendors or channel people to see how to optimize for external users?
A. Sun.com marketing is in the loop with all the groups that talk to the external world. The sun.com business requirements come out of that process.
Q. How does something nifty become a product? Doesn't Sun develop their own tools? Why use Infoseek?
A. SunLabs is developing a search engine. We have a good partnership with Infoseek to get what Sun wants. Sun focus is on building Solaris boxes. Don't want to get spread too thin trying to do everything. - Randall uses Swish, a freeware search engine.
A. Use vi or emacs.
Q. How do you solve the overhead problem from CGI processing, ASP for Microsoft?
A. Randall - mod_perl can be integrated into the server, but it makes the server huge: 500K vs 2MB without the programs. Some are 6MB. We are transitioning to a FastCGI layer for Apache, Netscape, and IIS. Dave - There are different requirements for an intranet, so it's not so much of a worry.
Q. Throw more hardware at it instead?
A. Patrick - That's not a problem for Sun. Randall - FastCGI yields a 400% improvement in throughput without the expense of hardware.
Q. What do you do with log analysis? What tools do you use? How are the results used?
A. Dave - Sun.com uses Accrue Insight, and SunWeb is using Wusage. SunWeb doesn't use analysis as much as sun.com does. - Patrick - It's a little difficult to get the proper data. The Internal sites have hierarchical proxy machines so requests don't get to the server. - Randall - XOR uses a heavily modified wwwstat Perl program. Plus raw logs are provided to customers.
Q. Do you look at referral logs?
A. Randall - We don't, but will turn it on for a customer. It's lots of useless data. We turn it on periodically for a snapshot. - Boulder Community Network uses Analog. It's a good one.
Q. Can you talk about Sun.net?
A. Sun.net is an Internet site that provides Sun employees with secure access to Sun's web-based resources through the public Internet from a Java-enabled, SSL-supported browser. Can't tell much about it, but it's a way to get to name search, room scheduling apps, and surf the intranet sites. It's turning into a product so we can't say much. It's great for the mobile field force, but not for the heavy duty command line user right now.
Q. Can you telnet?
A. That's coming.
Q. XML requirements?
A. Dave - It's getting there, figuring out where it fits. - Don - eSun is one of our biggest internal customers, and they will drive our requirements for XML. - Randall - Using it for publishing sites when it's easier to use a text editor. Loading XML to database driving templates to create custom pages.
Q. What about Shockwave and PDF?
A. Randall - Designers like Flash replacing Shockwave.
Q. How do you manage link rot?
A. The error page says "Sorry". It's hard to manage four million pages.
Q. You can use a spider to search for bad links.
A. Sure, but then you have to do something with it. Randall - Microsoft has a nice tool, but like most tools it doesn't work well with 10's of thousands of pages.
Q. Do you use the Netscape browser?
A. Yes and HotJava.
Q. For streaming media, would you use an outside source like Vstream in Boulder?
A. A different group decides that at Sun.
Q. Is there a link to Microsoft jokes at Sun?
A. No.
Q. Flash, Java, trying to figure out if anyone knows Java as an alternative to doing Flash.
A. Randall - Newest flash can use Java to install Flash.
Q. There's a tool from IBM called Hot Media cascading Java, breaks apps into pieces. It's used for virtual walkthroughs and animation. Only 4K to get it started, then feed it GIF's or whatever. It's free. It's called Net Media or Hot Media.
Q. To Sun - Do different departments publish to the Internet, or just the intranet. Do you manage the Internet publishing?
A. Don - Sun.com has an editorial board, and the SunWeb Council does the Intranet. Dave - java.sun.com and www.sun.com are merging.
Q. Does XOR have internal newsgroups?
A. We're not large enough to need it.
Q. Netscape has BadAttitude newsgroups, these got subpoenaed by Microsoft.
A. I bucked this upstairs to see what they will do. Email is forever.
Q. What tools manage permissions for uploading content?
A. All levels from desktop to sun.com are involved. We use staging areas. External publishing by sun.com is controlled; they use Documentum. Internal publishing on SunWeb varies and is free form. That encourages interesting things.
Q. Do you see increasing customization for E-Commerce?
A. Randall - Not much future in it. The Whole Foods story server personalizes each web visit. The problem is that the amount of site maintenance explodes. It's too hard. Dave - It's processor intensive. We have it on sun.com, called MySun, and it's coming to sun.net. - Patrick - Europe's strict privacy laws raise red flags that would restrict this.
Q. TV ads were raked over coals for injecting subliminal messages. Are there any web subliminals?
A. No, but some Java based ads have a putt putt golf game.
Q. Is that subliminal?
A. No just cool. Draws you in more when it's interactive.
Q. Sun has four million pages. Do you ever need to make global changes?
A. Yes, it takes a long time. Changes trickle down. Individual content providers need incentives to make the changes. Patrick - Can change the source if all are using those templates.
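
On the link rot question above, a spider over a tree of static pages can be quite small. The sketch below is an illustration only, not a tool that either company described: it checks relative links between local `.html` files and skips external, site-absolute, and fragment-only links. The names `LinkCollector` and `find_broken_links` are hypothetical.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urlparse

class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def find_broken_links(root):
    """Return sorted (page, href) pairs whose relative target file is missing."""
    root = Path(root)
    broken = []
    for page in root.rglob("*.html"):
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for href in collector.links:
            target, _fragment = urldefrag(href)
            parsed = urlparse(target)
            # Skip external, site-absolute, and fragment-only links in this sketch.
            if parsed.scheme or parsed.netloc or not target or target.startswith("/"):
                continue
            if not (page.parent / parsed.path).exists():
                broken.append((str(page.relative_to(root)), href))
    return sorted(broken)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        site = Path(d)
        (site / "index.html").write_text('<a href="missing.html">x</a>', encoding="utf-8")
        for page, href in find_broken_links(site):
            print(f"{page}: broken link to {href}")
```

As the answer points out, finding the bad links is the easy part; with millions of pages, someone still has to act on the report.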

URL's of interest:

http://www.colorado.edu/
http://www.colorado.edu/ITS/
http://www.sportingnews.com/
http://www.sun.com/
http://www.xor.com/

RMIUG appreciates the ongoing support from XOR Network Engineering (http://www.xor.com) for administration of RMIUG's electronic discussion lists & WWW site. Thanks also to NDA (http://www.nda.com) for sponsorship of refreshments for our group.

Respectfully submitted by Tom Bresnahan (tbrez@rmiug.org).


Copyright 2004 RMIUG.org, All Rights Reserved