11/10/98
RMIUG Meeting Minutes - Maintaining Huge
Web Sites
Alek Komaritsky called the meeting to
order at 7:00 p.m. About 120 people were
in attendance.
He introduced himself and the other members
of the Executive Committee in attendance:
Dan Murray, Tom Bresnahan. Not present were
Bryan Buus and Art Smoot.
Alek then opened the meeting to announcements
from the floor:
- Darren Powderly of TEKsystems dpowderl@teksystems.com
announced that there are job openings with
his company. He described TEKsystems as
one of the top technical service providers
in the United States. Due to their rapid
growth, they are constantly in need of technical
professionals with information systems skills
at all levels -- including Help Desk specialists,
field engineers, systems engineers, project
managers and consultants. Company URL: http://www.teksystems.com/
Contact information: Darren Powderly: (303)
412-2721 or Matthew Leuch: (303) 412-2743
or mleuch@teksystems.com
- Keith Buchanan keithb@seinesys.com announced
that he was looking for web development
references and resources including books,
web sites, etc. for HTML coding, web design,
marketing. Several members of the audience
and rmiug-discuss mailing list had suggestions.
You can view the mailing list archive (ftp://ftp.rmiug.org/rmiug/rmiug-discuss-archive)
for some of these suggestions. Keith plans
to publish a web page shortly with the collected
resources at http://www.seinesys.com/resources,
and he welcomes any additional input.
- Dan Murray dan@rmiug.org announced that
1999 would bring six RMIUG meetings and
asked the audience to submit their ideas
for meeting topics and speakers.
- Dave Goldhammer from the University
of Colorado Information Technology Services
department was scheduled to speak, but was
out of action due to the flu.
Alek had several copies of the book "Poor
Richard's Web Site: Geek-Free, Commonsense
Advice on Building a Low-Cost Web Site"
by Colorado author Peter Kent (PKent@TopFloor.com).
(http://www.poorrichard.com/) These were
awarded to deserving audience members. One
winner came the furthest to attend the meeting
(from Colorado Springs), and the other winner
was the person newest to web design.
- Don Milani Don.Milani@Sun.com was the
first speaker. He is a Manager with Web
Services Engineering at Sun Microsystems
in Broomfield. He started his presentation
with some information about Sun.
- Revenue $9.7 Billion, Fortune 200 Company
- 26,000 Employees and growing
- Global Operations: 250 locations in 90 countries
- New Campus in Broomfield will employ up
to 4000 employees over the next few years.
- Network Computing
- Industrial-Strength Chip, OS, Systems &
Mass Storage Technology
- Java Programming Language
Sun's Information Resources (IR) organization
consists of 5 Production Data Centers (PSC)
World-Wide. SUN IR handles:
6 Terabytes of PSC Data
4,000,000 E-Mails Per Day
1,200,000 OLTP Transactions Per Day
37,000 Workstations
9,000 Servers
8,000 Remote Access Accounts
Web Services Engineering (WSE) is part
of Sun's overall IR organization and develops
the technology architectures, infrastructure
and core services to enable Sun's Web and
E-Commerce business strategy. The term E-Commerce
encompasses the terms E-Commerce (Internet
storefront) and E-Business (Extranet business-to-business
electronic interchange). They deliver secure,
robust, low-cost, high-performance, network-based
web-hosting and transaction infrastructure
bundled with site management and creation
tools, which enable end-users to focus on
content and functionality without regard
to hardware, OS, network infrastructure,
or geographic location.
Three of Sun's networks are:
- sun.com - Sun's information and E-Commerce
portal on the public internet.
- SunWeb - Sun's internal intranet.
- sun.net - an Internet site that provides
Sun employees with secure access to Sun's
web-based resources through the public
Internet from a Java-enabled, SSL-supported
browser. So instead of connecting to Sun's
Wide Area Network by dialing in via a
modem, remote users connect from any web
browser. Once there, they can use Sun's
corporate tools such as Sun email, access
their calendar, use NameTool, Sun's internal
corporate name directory (you need this
with 26,000 employees), and access Intranet
web sites. Sun.net is being patented and
will become a product that other companies
can buy.
The WSE Areas of Focus include:
- Architecture Design
- Infrastructure Deployment
- Core Services
These 3 areas primarily break down into the
3 main groups within the WSE organization
which has 14 employees: the Architecture group,
the Engineering group that does the infrastructure
part, and the Services Group. But we all
cross-pollinate and work within each other's areas.
The Architecture Design group consists
of three people who concentrate on:
- E-Commerce
- Intranet
- Internet
The Infrastructure Deployment group consists
of four people who concentrate on:
- Cache/Proxy Design
- Web Server Review & Recommendation
- Content Mirroring
- Web Access Optimization
They have the responsibility to build and
set up what has been architected.
Core Services concentrates on:
- SunWeb Operations
- Intranet Web hosting
- sun.com technical Operations
- Web Authoring & Construction Toolkits
- E-Commerce Support Model
My group consists of 6 Webmasters (actually
5 1/2, one person works part time). So once
it's been architected and built we oversee
the ongoing services of running it. So my
group's primary responsibility consists of
services for Sun's Intranet, known as SunWeb,
www.sun.com infrastructure webmastering, and
the company's centralized web hosting services,
known as SunIntraWeb. We provide web mastering
services and authoring toolkits and an e-commerce
support model.
SunWeb Statistics:
- 3,600 Intranet web sites
- 4,000,000 Intranet web pages
- 6,300,000 page transfers per day
- 500GB WAN http traffic per day
SunWeb ROI and benefits can be broken
down into the following categories:
Savings:
- Reduces the cost of doing business.
- Reduced distribution costs: savings
on pre-press, printing, packaging, CD
pressing, warehousing, shipping, fulfillment,
mailing-list maintenance, and archiving
costs.
- Reduced cost of SMI Gatekeeper labor
through HTML and graphical SunWeb templates,
style guides and tools.
- Reduced employee web site development
time with the SunWeb Construction Kit,
templates and style guides.
- Reduced calls to the Resolution Center.
- Reduced employee surf-time through web
directories and enterprise wide search
technologies.
- Rapid information deployment.
Cost Avoidance:
- Standard graphics and templates avoid
logo, copyright and information violations.
- All documentation online: obsoletes
hardcopy information distribution.
- Avoids costs of division-specific web
product evaluations, development, and
singular solutions that are not extendable
to other Divisions or supportable.
- Standards avoid loss of Internet/Intranet
integrity.
Qualitative:
- Timely delivery of mission critical
information to a world-wide audience.
- Data mining--the ability to obtain and
process information: intake, churn and
output, leading to knowledge and wisdom
for the organization.
- Disintermediation: the removal of barriers
between an audience (customers, employees,
and partners) and the group offering the
information or service.
- Workflow improvements.
- Increased employee productivity.
- "24x7" organizational productivity.
How Does Sun Use the Intranet?
- Repository and Distribution of Corporate
Information and News
- Collaboration and Knowledge Sharing
- Application Development Environment
& Deployment Mechanism
- Data Feed for Sun's E-commerce Applications
- Laboratory for Large-Enterprise Web
Computing
- As a laboratory for large-enterprise
web computing, this knowledge allows us
to drive architecture, infrastructure
and support issues in the Intranet, Internet,
and E-Commerce spaces.
SunWeb is the default home page for Sun's
internal users. Along the left panel of
the screen shot are the major corporate
resources. To find what you need you can
use the search function or the A-Z index.
Q. What search engine do you use?
A. Infoseek.
In general, SunWeb Content can be broken
down into 3 major categories: Information,
Business Applications, and Web Construction
and Publishing.
1. Information:
- Corporate News and info.
- Campus maps, directions, and phone numbers.
- Links to branch offices in major cities
and international offices around the world.
- Complete Human Resources information
online.
- Sun Library Information: all types of
industry resources.
- Extensive Divisional data:
- All the Divisional web sites
- Sales & Marketing slide shows
- Engineering data
- ...and more!
The seven divisions or business units are:
- Computer Systems
- Enterprise Services
- Network Storage
- Microelectronics
- Solaris Software
- Java Software
- Embedded and Consumer
2. Web Construction and Publishing is
handled by my group. We don't really do
any content, we mainly enable content production.
3. Business Applications are the interfaces
to help Sun run its business. All of them
are web based and most are or are becoming
Java based. They include the following tools:
- NameTool - name directory to help figure
out the 26,000 employees at SMI.
- OrgTool - Organizational Chart Tool
- Schedroom - tool for scheduling conference
rooms
- WebDist: application distribution mechanism,
so it's a "Third Party Software Kiosk".
- ServiceDesk - for reporting problems
with your system.
- Facilities Tool - for requesting any
type of Facilities service.
- SunTEA - Travel Expense Authorization
and accounting.
- SunU: for registering for Sun University
classes--our internal training org.
- A tool to access your personal payroll
information (PayCheck) and Benefits information
(Galaxy).
- SurveyTool - create, deliver, and sort
your own survey
- And many more!
WSE SunWeb Operations include:
- The SunWeb Council
- Intranet Architecture
- SunWeb Webmaster Services
- Search Engine Services
- FTP Services
- Web Stats Reporting
The SunWeb Council oversees all Intranet-related
issues. It is a cross-discipline team that
works together to do what is in the best interests
of the entire Sun community. Therefore, it
is a forum for setting web-relevant guidelines
and procedures, airing concerns, sharing ideas,
and getting to consensus on the business issues
concerning the SunWeb community.
Here are some links to some great resource
materials:
- Multidisciplinary Teams: A Must for
Web Development: http://www.computerworld.com/home/emmerce.nsf/all/davecol
- How to set up your intranet organization
and policies: http://www.sun.com/sun-on-net/Sun.Overview/policyindex.html
The Key Players on the SunWeb Council
are:
- WSE - We chair the SunWeb Council. We
act as the focal point, facilitators,
referee (in the sense of "one to whom
a thing is referred" and not meaning "having
final authority") and traffic cop (i.e.,
this issue should go to this sub-committee
for study and recommendation, etc.).
- Gatekeeper community: These are people
from the various Sun divisions who act
as the central point of contact to the
WSE SunWeb Webmasters, and for the content
providers in their divisions.
Content Providers include:
- Employee Communications: provides the
SunWeb front page content. It is overseen
by the Intranet Editor-in-Chief.
- Marketing Communications: MarCom provides
the Sun branding guidelines which are
used throughout Sun's Intranet.
The Technical Architecture and Infrastructure
support for SunWeb is spearheaded by Dave.
He is helped by the other Webmasters in WSE.
Dave's 24/7 SunWeb Webmaster Services
include:
- Intranet Architecture
- Search Engine Services: Second most
visited site on SunWeb.
- FTP Services
- Web Stats Reporting
- Web server set up and administration
- CGI scripting
- .. and learning new video games
Patrick is the webmaster for SunIntraWeb,
Sun's Internal Web Hosting Service
- Everything is handled for the internal
customer (except content):
- Consulting, administration of server,
scripting, back-up, etc.
- Currently 300 accounts at various service
levels
- Service growing at ~10%/month
WSE SunWeb Operations provide web construction
toolkits for enabling page and site development.
Toolkits include templates, formats, guidelines,
resources and library of tools. E-Commerce
Construction Toolkits consist of Common
Services - reusable development components
and frameworks with security designed in.
The toolkit contains templates and simplified
processes for web authoring, content management,
and publishing on SunWeb and SunIntraWeb
including information on how to construct
web Pages and web sites:
- Designing
- Building
- Testing
- Publishing
- Maintenance
plus a library of tools for doing all of the
above:
- HTML Editors
- How to do HTML Conversion
- Perl and Java Programming resources
- How to handle graphics
- How to work with meta tags and get indexed
by Sun's search engine.
- How to implement stats reporting.
- Suggested Web Servers
The toolkit contains the resources Sun people
will need for getting the job done:
- How to publish on Sun's Intranet and
Internet.
- The Publishing policies
- The SunWeb Council
- How to use the Sun web hosting service.
- Sun.com editorial board
E-Commerce Construction Toolkits concentrate
on "Common Services" that all developers
will need in doing E-Commerce applications.
Examples of Common Services are things such
as Authentication, Session management, etc.
These will be developed as reusable development
components & frameworks that will operate
in an enterprise-wide level, and be adaptable
to changing business requirements. Components
are packaged functional units (usually classes
and objects). These address 40-60% of common
services you would have to write anyway
in an E-Commerce application. Frameworks
define the way in which a system of components
interact. We're working hand-in-hand with
the Sun.Net team to ensure compatibility and
interoperability.
SunWeb was named one of 50 top Intranet
web sites (out of 700 applicants) in the
2nd annual CIO Web Business awards (July
1st 1998 issue). http://sun.com/smi/Press/sunflash/9807/sunflash.980706.2.html
Sun has over 180 job openings in Colorado,
mostly in Broomfield. http://www.sun.com/jobs
http://suncolorado.com
Questions from the audience for the Sun
team:
Q. Does the WSE exert absolute control over
web content?
A. On the intranet side, more guidance than
control. On the sun.com side, the sun.com
editorial board controls it.
Q. What fraction of corporate cost goes
into Intranet?
A. 25 people keep it going. ROI - saves
$25 million in publishing costs, mail, and
printing. It's hard to measure.
Q. How to organize the content? What about
differing opinions?
A. We just redesigned the look and feel
for low bandwidth clients.
Q. Is there a policy so every employee can
have a web page?
A. It's free form, but E-Commerce is more
structured.
Q. Extranet was not a term that was used.
A. It will be called E-Commerce, E-Business.
Q. Define.
A. E-Commerce = storefronts, E-Business
= business to business.
Q. What's on SunWeb for employee to employee
communication?
A. There are employee communication groups,
editor of SunWeb is member of Employee Communication
Group. Mostly email lists.
Q. No online forums?
A. Internal Newsgroups.
Q. Can an employee create their own groups?
A. Yes. There are probably over 1000.
Q. Are web pages only within Sun or accessible
from outside Sun?
A. Just in Sun.
Q. Does the SunIntraWeb use streaming media?
A. Starting to lay networks. Big infrastructure
cost. Boeing has it on every desktop, costs
major bucks.
- The next speaker was Randall Gaz gaz@xor.com,
a Senior Programmer in the Internet Technologies
Group at XOR Network Engineering.
XOR is eight years old and has 45 employees.
It maintains rmiug.org and the RMIUG mailing
lists. The Internet consulting group develops
high exposure sites like Sporting News (http://www.sportingnews.com),
Golf Online, and SkiNet. These sites are
engineered with performance in mind. What
are the decisions that affect this?
Responsiveness - making large sites responsive
to the user. Why is this important? You
don't go back to slow sites. The Disney Toy
Story site was tuned, and the next day it
got a 30% increase in hits.
How do I do it? - Pre-generate HTML pages
as much as possible. Tune database access,
design in pieces by heavy use and light
use, tune the operating system and server,
plan for adding more machines. You will
need more machines. It's more expensive
to make one machine bigger than to use
multiple machines.
Pre-generate content - Sporting News publishing
system generates daily stories, AP feeds,
and their own stories in St Louis MO. They're
converted to HTML then transferred to Boulder.
It uses an Informix database with complex
queries, hyperlinks on player names. The
old database was Illustra and page generation
took up to two minutes. You can't do that
online. What's the tradeoff for pre-generated
pages?
You need lots of disk space. 50,000 pages
require multiple megabytes. The web server
is tuned to send complete files and cache
files in the browser, so downloads are faster
next visit.
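The pre-generation idea above can be sketched
in a few lines of Python. This is only an
illustration, not XOR's actual publishing system
(the talk did not describe its internals); the
story records, the function names render_story
and pregenerate, and the output layout are all
assumptions.

```python
import html
import tempfile
from pathlib import Path

# Hypothetical story records; in the talk these came from a database and wire feeds.
STORIES = [
    {"slug": "opening-day", "title": "Opening Day", "body": "The season starts today."},
    {"slug": "trade-news", "title": "Trade News", "body": "Two players changed teams."},
]


def render_story(story):
    """Build a complete static HTML page for one story."""
    title = html.escape(story["title"])
    body = html.escape(story["body"])
    return (
        "<html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1><p>{1}</p></body></html>\n".format(title, body)
    )


def pregenerate(stories, out_dir):
    """Write one .html file per story so the web server only has to send files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for story in stories:
        path = out_dir / (story["slug"] + ".html")
        path.write_text(render_story(story), encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for page in pregenerate(STORIES, tmp):
            print(page.name, page.stat().st_size, "bytes")
```

The expensive work (queries, formatting) happens
once at publishing time; each visitor request
is then a plain file transfer, at the cost of
the extra disk space mentioned above.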
Database driven sites - Just because content
is in database doesn't mean that you have
to access the database each time. On the
catalog site 200 pages are pre-generated
so very responsive. Pick a preferred view
of the data the first time they see it.
Other views are not necessary for each user.
Application development - Break up apps
into pieces, some are seen by every user,
some only seen once per user. On Sporting
News each user has a custom home page with
favorite teams and sports. This page is
stored on server.
Tune the operating system - Use a modern
version of the operating system, less than
2 years old. Use lots of memory to cache.
Just use the machine as web server, not
also the mail, file server. Dedicate specific
machines to specific web sites. Design to
split services across multiple machines.
Sporting News gets 100,000's to millions
of hits per day. The Pumpkin Carving tools
web page got two million hits per day just
before Halloween.
Q. What about performance?
A. XOR uses the Apache web server. Netscape
didn't scale for shared servers, or different
configurations for each server so it can't
share processes. We use IIS on NT. Turn
off host name lookups to help user responsiveness.
Lookup is done before sending a page which
slows things down. Remove unnecessary functions:
no server side includes, no user directories.
This setup is different from University
of Colorado, which has to supply these services.
Take out Apache negotiation with MIME types.
Q. How do you decide these?
A. Server side includes were a policy decision.
We don't have users, so we didn't need user
directories. Other modules were removed
based on experience.
Q. What do you do when the customer complains
about not having server side includes?
A. We work around it by pre-generating content.
Otherwise it's too slow.
Regarding multiple machines - When we
changed from one to two servers, some CGI
didn't work with more than one machine.
Publishing systems separate content from
presentation like Ziffnet, CNet, newspapers.
You don't need a large machine if designed
right. A 486 can saturate a T1 with pre-generated
content. We use BSDI UNIX on commodity Intel
hardware.
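A rough back-of-the-envelope check shows why
a small machine can fill a T1 with static pages.
The T1 line rate of 1.544 Mbit/s is standard;
the 10,000-byte average page size is an assumption
made for illustration and was not given in the
talk.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate
ASSUMED_PAGE_BYTES = 10_000     # assumption: average pre-generated page size


def pages_per_second(line_bits_per_second, page_bytes):
    """Upper bound on pages per second, ignoring protocol overhead."""
    return line_bits_per_second / 8 / page_bytes


if __name__ == "__main__":
    rate = pages_per_second(T1_BITS_PER_SECOND, ASSUMED_PAGE_BYTES)
    print("about %.1f pages per second fill a T1" % rate)
```

Under these assumptions the line is full at
roughly 19 pages per second, a rate at which
the server is only reading files and writing
them to the network.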
Q. How do you monitor Apache servers?
A. Homegrown Perl scripts, monitor SSL servers.
Q. How do 2 servers communicate?
A. All the HTML is on both machines except
standings and other content. DNS redirect
finds it on the other machine.
Q. Modern browsers send http host so you
don't need a separate address for each machine.
A. We give each one its own address since
we have enough IP for legacy browsers.
Q. Rick Duffy at SuperNet posts browser
statistics, and there are still a lot of
AOL users. Do you design for their old browser?
A. Most AOL users upgrade to a modern browser.
Q. Does XOR have 7/24 operations on site
or on call?
A. Servers are amazingly reliable - 350
days for sportingnews.com. Even NT has been
up for 200 days. Pagers are sufficient for
now.
Q. Pre-generated pages are better, but what
about database updating?
A. On Sporting News, when a story is changed,
it's written out to html. Pricing catalogs
- do scheduled updates every couple hrs.
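One way to picture the "write it out when it
changes" approach is a helper that rewrites a
static page only when its content differs from
what is already on disk. This is a sketch under
assumptions: the helper name write_if_changed
and the sample pages are hypothetical, and the
talk did not say how XOR detects changes.

```python
import tempfile
from pathlib import Path


def write_if_changed(path, html_text):
    """Rewrite the static page only when its content differs; return True if written."""
    path = Path(path)
    new_bytes = html_text.encode("utf-8")
    if path.exists() and path.read_bytes() == new_bytes:
        return False  # page on disk is already current
    path.write_bytes(new_bytes)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "story.html"
        print(write_if_changed(page, "<p>First version</p>"))  # new page
        print(write_if_changed(page, "<p>First version</p>"))  # same content
        print(write_if_changed(page, "<p>Corrected</p>"))      # edited story
```

A scheduled job, such as the catalog update every
couple of hours, could call the same helper for
every page and touch only the files that changed.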
After Randall finished his presentation,
he was joined on stage by Don, Dave, and
Patrick from Sun for more audience questions.
Q. You're at the leading edge of technology,
so how do you learn or just keep up? There
aren't any books for this, are there?
A. Randall - I'm application directed, and
I read perl.com, newsgroups, search the
web. Dave - I ask someone we work with,
and newsgroups are good too. Internal technical
community and internal newsgroups at Sun
help. It's easier when it's all on the same
platform. Internal customers are more forgiving,
and a good resource too.
Q. Extranet - Do you work with vendors or
channel people to see how to optimize for
external users?
A. Sun.com marketing is in the loop with
all the groups that talk to the external
world. The sun.com business requirements
come out of that process.
Q. How does something nifty become a product?
Doesn't Sun develop their own tools? Why
use Infoseek?
A. SunLabs is developing a search engine.
We have a good partnership with Infoseek
to get what Sun wants. Sun focus is on building
Solaris boxes. Don't want to get spread
too thin trying to do everything. - Randall
uses Swish, a freeware search engine.
Q. HTML?
A. Use vi or emacs.
Q. How do you solve the overhead problem
from CGI processing, ASP for Microsoft?
A. Randall - mod_perl can be integrated into
the server, but it makes the server huge. 500K
vs 2MB without the programs. Some are 6MB.
Transitioning to FastCGI layer for Apache,
Netscape, IIS. Dave - Different requirements
for an intranet, not so much a worry.
Q. Throw more hardware at it instead?
A. Patrick - That's not a problem for Sun.
Randall - FastCGI yields a 400% improvement
in throughput without expense of hardware.
Q. What do you do with log analysis? What
tools do you use? How are the results used?
A. Dave - Sun.com uses Accrue Insight, and
SunWeb is using Wusage. SunWeb doesn't use
analysis as much as sun.com does. - Patrick
- It's a little difficult to get the proper
data. The Internal sites have hierarchical
proxy machines so requests don't get to
the server. - Randall - XOR uses a heavily
modified wwwstat Perl program. Plus raw
logs are provided to customers.
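For readers who have not looked inside a web
server log, the kind of counting these tools
do can be sketched as follows. This is not Accrue
Insight, Wusage, or wwwstat; it is a minimal
Python illustration that assumes Common Log
Format lines, and the sample log entries are
invented.

```python
import re
from collections import Counter

# Matches the request line and status code of a Common Log Format entry.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')

# Invented sample entries for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Nov/1998:19:00:01 -0700] "GET /index.html HTTP/1.0" 200 5120',
    '10.0.0.2 - - [10/Nov/1998:19:00:02 -0700] "GET /scores.html HTTP/1.0" 200 2048',
    '10.0.0.1 - - [10/Nov/1998:19:00:05 -0700] "GET /index.html HTTP/1.0" 200 5120',
    '10.0.0.3 - - [10/Nov/1998:19:00:09 -0700] "GET /missing.html HTTP/1.0" 404 312',
]


def count_hits(lines):
    """Count successful (status 200) requests per path."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    for path, count in count_hits(SAMPLE_LOG).most_common():
        print(count, path)
```

As Patrick notes, counts like these undercount
real usage when proxy caches answer requests
before they reach the server.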
Q. Do you look at referral logs?
A. Randall - We don't, but will turn it
on for customer. It's lots of useless data.
We turn it on periodically for snap shot.
- Boulder Community Network uses Analog.
It's a good one.
Q. Can you talk about Sun.net?
A. Sun.net is an Internet site that provides
Sun employees with secure access to Sun's
web-based resources through the public Internet
from a java-enabled, SSL-supported browser.
Can't tell much about it, but it's a way
to get to name search, room scheduling apps,
and surf the intranet sites. It's turning
into a product so we can't say much. It's
great for mobile field force, but not for
the heavy duty command line user right now.
Q. Can you telnet.
A. That's coming.
Q. XML requirements?
A. Dave - It's getting there, figuring out
where it fits. - Don - eSun is one of our
biggest internal customers, and they will
drive our requirements for XML. - Randall
- Using it for publishing sites when it's
easier to use a text editor. Loading XML
to database driving templates to create
custom pages.
Q. What about Shockwave and PDF?
A. Randall - Designers like Flash replacing
Shockwave.
Q. How do you manage link rot?
A. The error page says "Sorry". It's hard
to manage four million pages.
Q. You can use a spider to search for bad
links.
A. Sure, but then you have to do something
with it. Randall - Microsoft has a nice
tool, but like most tools it doesn't
work well with 10's of thousands of pages.
Q. Do you use the Netscape browser?
A. Yes and HotJava.
Q. For streaming media, would you use an
outside source like Vstream in Boulder?
A. A different group decides that at Sun.
Q. Is there a link to Microsoft jokes at
Sun?
A. No.
Q. Flash, Java, trying to figure out if
anyone knows Java as an alternative to doing
Flash.
A. Randall - Newest flash can use Java to
install Flash.
Q. There's a tool from IBM called Hot Media
cascading Java, breaks apps into pieces.
It's used for virtual walkthroughs and animation.
Only 4K to get it started, then feed it
GIF's or whatever. It's free. It's called
Net Media or Hot Media.
Q. To Sun - Do different departments publish
to the Internet, or just the intranet. Do
you manage the Internet publishing?
A. Don - Sun.com has an editorial board,
and the SunWeb council does the Intranet.
Dave - java.sun.com and www.sun.com are merging.
Q. Does XOR have internal newsgroups?
A. We're not large enough to need it.
Q. Netscape has BadAttitude newsgroups,
these got subpoenaed by Microsoft.
A. I bucked this upstairs to see what they
will do. Email is forever.
Q. What tools manage permissions for uploading
content?
A. All levels from desktop to sun.com are
involved. We use staging areas. External
publishing by sun.com is controlled; they
use Documentum. Internal publishing on SunWeb
varies and is free form. That encourages
interesting things.
Q. Do you see increasing customization for
E-Commerce?
A. Randall - Not much future in it. The
Whole Foods story server personalizes each
web visit. The problem is that site maintenance
amount explodes. It's too hard. Dave - It's
processor intensive. We have it on sun.com,
called MySun, and it's coming to sun.net.
- Patrick - Europe's strict privacy laws raise
red flags that would restrict this.
Q. TV ads were raked over coals for injecting
subliminal messages. Are there any web subliminals?
A. No, but some Java based ads have a putt
putt golf game.
Q. Is that subliminal?
A. No just cool. Draws you in more when
it's interactive.
Q. Sun has four million pages. Do you ever
need to make global changes?
A. Yes, it takes a long time. Changes trickle
down. Individual content providers need
incentives to make the changes. Patrick
- Can change the source if all are using
those templates.
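Patrick's point about templates can be shown
with a small Python sketch: when every page is
built from one shared template, a global change
is a single edit followed by a rebuild. The template
text, page data, and the build_site name are
hypothetical and are not taken from the SunWeb
Construction Kit.

```python
from string import Template

# Hypothetical shared template; every page is built from it.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><p>$body</p><hr><small>$footer</small></body></html>"
)

# Hypothetical page content supplied by individual content providers.
PAGES = {
    "news.html": {"title": "News", "body": "Corporate news."},
    "maps.html": {"title": "Maps", "body": "Campus maps."},
}


def build_site(pages, footer):
    """Render every page from the shared template with one common footer."""
    return {
        name: PAGE_TEMPLATE.substitute(
            title=page["title"], body=page["body"], footer=footer
        )
        for name, page in pages.items()
    }


if __name__ == "__main__":
    site = build_site(PAGES, "Copyright 1999")
    for name in sorted(site):
        print(name, "->", site[name])
```

Pages that were written by hand, outside the
template, are the ones where changes have to
trickle down provider by provider.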
URL's of interest:
http://www.colorado.edu/ http://www.colorado.edu/ITS/
http://www.sportingnews.com/ http://www.sun.com/
http://www.xor.com/
RMIUG appreciates the ongoing support
from XOR Network Engineering (http://www.xor.com)
for administration of RMIUG's electronic
discussion lists & WWW site. Thanks also
to NDA (http://www.nda.com) for sponsorship
of refreshments for our group.
Respectfully submitted by Tom Bresnahan
(tbrez@rmiug.org).