From Computational Science to Internetics: Integration of Science with Computer Science

We describe how our world, dominated by Science and Scientists, has been changed and will be revolutionized by technologies moving with Internet time. Computers have always been well-used tools, but in the beginning only the science counted and little credit or significance was attached to any computing activities associated with scientific research. Some 20 years ago, this started to change and the area of computational science gathered support, with the NSF Supercomputer centers playing a critical role. However, this vision has stalled over the last 5 years with information technology increasing in importance. The Holy Grail of computational science -- scalable parallel computing -- is still important but is just one supporting component of the Internet revolution. We discuss the emergence of the field of Internetics -- bridging computer science and all application areas, whether simulation or information based. Internetics is an exciting field, which seems complete and rich enough to be a lasting interdisciplinary area. Physics and other core science and engineering disciplines used to attract the very best minds but now their popularity is declining. We describe curricular initiatives that can re-invigorate these fields. This curricular turmoil must be addressed by our education infrastructure, whose professorial staff find it hard to develop courses to satisfy student and employer interests in times of such rapid change. Distance education is very relevant as it can be used to disseminate expertise to students and teachers in these new areas. All of this has implications for our educational institutions, which could be quite profound.

was not hacking but as interesting an application of the scientific method as studying quarks and gluons. Often research groups in my general area (phenomenology of particle physics) assigned "failed theoretical students" to the "trivial" task of using computers to translate real physics (equations) into quantitative comparison with experiment. Again, a sizeable fraction of experimental high-energy graduate students did not continue in their field after getting a Ph.D. but rather got jobs exploiting their considerable software skills. Typically these skills had been obtained on the job, and the students knew essentially no computer science and had no understanding of the principles underlying the hardware and software of computers. In this world only the "physics" counted. I tried, reasonably successfully, to show the value of the combination of quality in both computing and physics, but my task of course was to show that doing better computing (better algorithms, sophisticated data analysis packages, more simulations at higher performance) led to better physics.
This set of values still dominates science, and what I call "computational X" (where X = biology, chemistry, physics) is characterized by the practitioners being almost completely judged by their contribution to the field X. This view has some important positives, for surely it is critical to be requirement (i.e. application) driven. On the other hand, the unyielding focus and reward structure of the application drivers means it is often hard to take advantage of new computing techniques, as the immediate "return on investment" perhaps cannot be justified from the science. For instance, one could think that a switch of programming language from Fortran to C++ or Java, or the laborious conversion of one's simulation to parallel systems, would pay off in the long run with better code giving better performance. Often scientists have neither the time nor the skills for such steps.

2: The Initial Interdisciplinary Vision: Computational Science
As computers got more powerful, and both departmental (desktop) machines and the many supercomputer centers focussed the message, the importance of computing to science became better recognized. In the late 1980s the term "computational science" became popular. It uncontroversially characterized a type of research and, less clearly, defined an interdisciplinary academic field lying between computer science and science and engineering, as illustrated in figure 1. There is no doubt that application areas are important to computer science (and related fields such as computer engineering and applied mathematics), for the applications motivate much important academic research and industry development. Computers and computer science would look somewhat different in a world where science did not involve partial differential equations and matrices. For instance, very different compiler techniques, memory layouts and parallel algorithms would be needed if genetic algorithms and not differential equations described most of science. John Rice himself both pioneered the field of computational science and wrote some of the best descriptions of both the general principles and particular programs. [Rice:95a] Our program at Syracuse University has a conventional structure. [Fox:91f,Fox:92d] There is no separate department of computational science but rather new degree options constructed from courses in existing academic units. I must be honest and admit that the activity was a failure, or more precisely stillborn, as the launching of the new program coincided with the economic recession of the early 1990s. This had a brutal impact on Syracuse University, where a drop of some 20% in admitted students put on hold attempts to start new programs such as computational science. Nevertheless, we did define a reasonable curriculum built around two core courses shown in table 1.

Table 1: Core Simulation Track Computational Science Courses in Syracuse Program
Course | Description
CPS615 | Computer Technology and its future projection, Computer Architecture, Application Motivation, Performance Analysis, Programming Models, MPI, (F90, HPF), Java for Science and practical algorithms such as: particle dynamics, PDEs with CFD as example, Random numbers, Monte Carlo.
CPS713 | Detailed studies of 3 computational areas such as Numerical Relativity, Optimization, Computer Graphics, Condensed Matter, and Experimental Physics Data analysis

These required courses were combined with application electives (such as computational fluid dynamics or computational physics) and relevant applied mathematics, computer engineering and computer science courses to define the curriculum. As Syracuse abandoned its plans to hire several computationally oriented faculty in a variety of disciplines, the curriculum always lacked richness and broad student interest.

3: A Better Dream: Internetics
As we searched for a better match to Syracuse's new reality, we noticed that the majority of students, and indeed the best students, did not take science majors but rather entered fields like journalism and public affairs which are information and not simulation based.
Other areas where the technologies of information processing seem more relevant than simulation include medical school (telemedicine), law, management, architecture and education. We had also in the period 1991-93 surveyed New York State industry and realized that classic HPCC simulations were not nearly so important as large-scale information (database) applications. [Fox:92e,Fox:94h,Fox:96b]

First offering of Java Academy for Syracuse area middle and high school students and their teachers.

Spring 1999 | Jackson State offers course similar to CPS606 to students at Morgan State using Tango Interactive.
Spring 1999 | Java Academy offered using Tango Interactive to students at Boston, Houston, Starkville (Miss.) and Syracuse.
Fall 1999 | 5th consecutive semester of Tango Interactive distance education classes featured CPS640 and over 40 students at 6 distinct sites. Classes so far offered in such fashion are the Java Academy, CPS406, CPS606, CPS615, and CPS640.

We define Internetics as an emerging field centered on technologies, services and applications enabling and enabled by worldwide communication and computing grids. So we include computational grids and systems like Globus [Globus:99a] as part of Internetics. Classical parallel computing is of course a special case of a computational grid and also a subset of Internetics. This is illustrated in fig. 3, which parodies the oft-stated assertion (by me among others) that parallel computing is inevitable. We noted that as the price of computers dropped, one would see an inversely proportional increase in the number of them. We naively assumed the power user scenario on the left, with each user gathering more and more CPUs onto themselves, and consequently a pervasive deployment of massively parallel machines. However, what actually happened is shown on the right: certainly more computers, but also a more or less equal increase in the number of users, so that there is relatively modest use of parallel systems. Alternatively one can say that the growth is in loosely coupled distributed computing and not in tightly coupled parallel machines.
The technological drivers for Internetics come from the computer, communication and information science fields, but we define the field to have an applied or interdisciplinary flavor. Internetics is today of growing importance in many application fields such as telemedicine, electronic commerce, digital journalism and education, while scientific computing is itself making increasing use of information as well as simulation technologies. The applied focus with many totally new and rapidly evolving technologies makes Internetics a unique area. The breadth and lack of restriction in application area allows a lasting definition for the field.

Fig. 3: Ratio of Users and Computers in the two tracks of Computational Science
In sec. 2 and fig. 1, we took the usual definition of computational science as the interdisciplinary field in between Computer Science and "large scale Scientific and Engineering simulation-based" applications. The corresponding academic application fields include aerospace engineering, biology, physics etc. Combining figs. 1 and 2, we define Internetics as the interdisciplinary field between computer science and both simulation and information-based applications. Now the academic applications corresponding to Internetics are broadened, as shown in fig. 2, to include areas such as bioinformatics and public communication. Note that analysis of high energy physics experimental data is an "information" application, while turning to theory in this field gives a simulation application such as studies of quarks and gluons by Quantum Chromodynamics Monte Carlo. In university or more precisely research organizations, simulation based applications are probably at least as important as those based on information technologies. However, technical computing is a small part of the total computer market (certainly less than 5%) and information applications dominate the commercial world. We saw this in parallel computing, where adoption was quite good by university researchers but industry uptake has been modest and most commercial parallel computing is focused on databases. Thus, as only a few students will stay in academia, it is natural to give Internetics an information flavor. However, Internetics still has the essential interdisciplinary character that is so important in computational science. One will probably find different implementations of Internetics in different universities. At Syracuse, the vast majority of students are aiming at a job in industry and so the information flavor of Internetics is very popular and appropriate. Looking from 1995-1998, enrollment in the graduate simulation track of Computational Science courses at Syracuse has dropped from 30 to 5 per semester; enrollment in the information track has risen from 6 to 100.

We added "electives" to the core material in table 3 so that one could choose two from courses such as "Computational Grids", "Electronic Commerce" or "Distance Education" to accompany four technology core courses expanding on material in CPS606, 616 and 640. In this model, CPS714 is viewed as one of the possible electives. Syracuse students will tell you that CPS616 especially covers a mind-boggling amount of material and should be stretched out. The undergraduate curriculum had a similar flavor, with a total of four courses surveying the material of table 3 at a lower level. The course CPS606 is offered at Jackson State (CSC499) and Syracuse (CPS406) in a simplified undergraduate version. At the pre-college level, we proposed a basic Web introduction combined with a second course where students could learn Java. We have experimented with the latter course [JavaAcademy:99], where the language is taught very differently from CPS406/606. In our K-12 Java Academy we focus on the use of Java to make simple applets and use these graphical applications to motivate the basic language constructs. Java is attractive as a first programming language for several reasons; one of these is that "Hello World" is a nifty web page rather than a dull piece of printer paper as in C++ or Fortran. Thus at the K-12 level, we want to emphasize those parts of Java that enable students to communicate with their peers through writing of applets. Table 4 describes these additional non-graduate courses we have developed.

In concluding this section, we note a possible criticism that Internetics is chasing today's technology fad and has no lasting significance. We agree that the particular course material in tables 1 and 3 reflects interests that may change with time. However, our definition of Internetics as the interdisciplinary area between computer science and all applications has lasting significance. In some ways, it is better defined than computational science, whose scope depends on the changing interest in application fields and on the relative importance of information and simulation to a given field. The completeness of the field of Internetics makes it an attractive academic specialty.

4: Down but not quite out: Physics on the Ropes
Emerging fields such as information technology and biotechnology are competing with traditional fields for (the best) students. In times gone by, fields like physics had little trouble in attracting the very best graduate students, and in fact around 1980, while chair of the physics graduate admissions committee at Caltech, I had to discourage many excellent applicants. Now physics research is healthy intellectually and has comparable funding to the past, but for many students it is no longer the field of choice. Other fields like chemistry, mathematics and aerospace engineering are also seeing a decline in student interest. In fact at Syracuse, where many graduate students are foreign, admission to graduate study in areas like physics is sometimes taken as an opportunity to get a masters degree in computer science or engineering "on the side". This masters degree can lead to a job in the information technology industry and then to the desired "green card".
Predictions for the future are quite unreliable, but one expects a continuing demand for information technology workers and so this shift in student interests will continue and perhaps even accelerate. Uncontroversially, the best students will surely find unlimited opportunities for several years to come in the information technology area. Other reasons for traditional fields to examine the relevance of information technology come from its increasing role in both experimental and theoretical science. For instance, today a distributed collaborative environment to gather sensor data is as important as new algorithms for simulating the related physical phenomena. Further, most information technology jobs will not be in basic systems but in application development. In discussing the impact of information technology on traditional academic disciplines, I will concentrate on physics, as this is my personal expertise. However, I believe the comments are valid for related areas like mathematics or engineering.
So physics could try to meet the challenge of information technology head on and compete for the best students with the undoubted excellence of its core program. However, perhaps this will only work at a few schools. We could enrich the field with traditional computational science but this seems unlikely to be enough. Rather, "if you can't beat them, join them". Thus physics can meet the competition of information technology by embracing it and including it in physics degrees. Internetics is a key concept here as it captures the interface between physics and IT. There are two models one could consider: firstly, an IT minor within a basic physics/engineering education, and secondly the opposite choice, an engineering/physics/math methods minor within an IT education. It is worth noting that physics does give any student a solid technical training and a good understanding of several systems coming from nature. This is surely helpful in studying distributed (computer/information) systems, which are a major challenge today for computer science.
A combination of Physics and a minor in Internetics is an interesting background for many careers, such as a systems engineer designing global information systems, an experimental physicist designing new data analysis systems, and a pre-college science teacher. Suppose one compares communication of information using either the Web or more traditional material like books. Authoring Web material appears to demand a different mix of skills than that of traditional journalism or book writing. In particular the Web offers opportunities for "technical people" who can design and construct illustrations of science involving Java applets combined with numerical algorithms. Again, interfacing an experimental physics instrument to the Web may sometimes be more effective in communicating scientific concepts than streams of beautiful English words and nifty drawings. This suggests that physics departments could contribute to an "Internetics" minor with optional electives in "science communication". At Syracuse University, this idea is attractive as the Newhouse School of Communications gets excellent students. In this regard, we are designing a new course called "Internetics and Communicating Science", which aims to teach the different ways presented by the Internet for communicating science and quantitative ideas to laymen as well as to technically trained people. The course is designed for undergraduate students with interests bridging science and communications; these include prospective science, journalism, and education majors. It will offer an introduction to the tools required to communicate using the Internet, as well as case studies of successful and unsuccessful approaches to communicating science with this new medium. The course will describe not just the development of animated pages with Shockwave and Java but also ways of visualizing information as well as scientific data. This is just one possible synergy between physics and Internetics and is aimed at reaching new students with physics ideas. Simpler is the design of Internetics courses aimed at physics and related majors. These would include specially designed material such as discussions of the construction of Web Portals for scientific research.

5: Even Rocks can Crumble: Changing Enterprise Model for Universities
Our discussion of Internetics highlights the relevance of distance education, as we see new and rapidly changing academic curricula for which many universities do not have trained faculty able to give quality courses. Thus one scenario is for a few experts to prepare and deliver the courses, which are then taken by many more students in total. One could jump to the conclusion that this would also lead to cheaper education, but this is not so clear as the cost of developing rapidly changing course material is high. As a simple example, just keeping up with changes in the Java language in our basic course CPS406/606 is non-trivial as it affects both lectures and, more importantly, examples and homework. We estimate that one needs around 5 to 10 times the conventional (say 25 students) class size to be able to deliver classes like those in the Internetics curriculum with typical tuition rates. This suggests that instead of one professor teaching 2 to 3 classes per semester, a group of some 3 professors collaborate on a single class delivered to many more (say 250) students. As a typical university doesn't have this many students, some form of distance education seems a good approach so that one can amortize course preparation over several universities. This creates a new sort of collaborative virtual university linking previously independent institutions. We believe that market forces will force universities to address their business model and only those that change can survive. We have also stressed that the changes that are driving the need for and implementation of this new business model are moving with Internet time. Thus it appears wise for university leaders to address these issues urgently and take steps now to explore and put in place the necessary collaborations and technologies needed for distance education.
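The staffing arithmetic in this paragraph can be checked with a short calculation. The sketch below uses only the figures quoted above (25-student conventional classes, 2 to 3 classes per professor, 3 professors sharing one 250-student class); the function name and the assumption that each collaborating professor works on only that one shared class are ours, for illustration.

```python
def enrollments_per_professor(class_size, classes_per_professor=1, professors_per_class=1):
    """Student enrollments carried by one professor in one semester."""
    return class_size * classes_per_professor / professors_per_class


CONVENTIONAL_CLASS = 25   # "say 25 students"
SHARED_CLASS = 250        # "say 250" students in the collaborative class

# Conventional model: one professor teaches 2 to 3 classes per semester.
conventional_low = enrollments_per_professor(CONVENTIONAL_CLASS, classes_per_professor=2)
conventional_high = enrollments_per_professor(CONVENTIONAL_CLASS, classes_per_professor=3)

# Collaborative model: 3 professors share one large class
# (assumption: this is the only class each of them teaches).
shared = enrollments_per_professor(SHARED_CLASS, professors_per_class=3)

# Ratio of shared to conventional class size; the text asks for 5 to 10.
scale_factor = SHARED_CLASS / CONVENTIONAL_CLASS

print(conventional_low, conventional_high, round(shared, 1), scale_factor)
```

Under these assumptions a professor carries 50 to 75 enrollments in the conventional model and roughly 83 in the collaborative one, so the number of students served per faculty member is similar. This is consistent with the remark that the shared model does not obviously make instruction cheaper; what changes is that preparation effort is concentrated on one course.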
We have had quite a positive experience using distance education to teach courses at Jackson State University in Mississippi each semester since fall 1997. [NPACDistEd:99] We have used our collaborative system Tango Interactive to deliver several courses over the Internet and the Defense high performance network DREN, with regular 80-minute classes given twice a week from Syracuse some 1000 miles away. The course material is adapted from the computational science and Internetics classes in tables 1, 3 and 4, including the undergraduate CPS406 and graduate classes CPS615, 616 and 640. In our implementation, the curricula, homework, grading, and facilities were the responsibility of NPAC at Syracuse University. So, for instance, parallel computing and database homework was submitted to clusters running MPI or Oracle database engines at Syracuse. There was a Jackson State professor assigned to the class as mentor, and the students received Jackson State and not Syracuse academic credit. This is of course what they want, as it allows the class to be part of their Jackson State degree. Jackson State is a major historically Black university which graduates many computer science students, and our classes were welcome as they offered curriculum material not available through regular Jackson State classes. In fact professors assigned as mentors learnt the material and were able to later offer comparable classes locally. Correspondingly we changed our curriculum as the local expertise changed and always offered classes which did not compete with base courses but were add-ons with "leading edge" material (Web Technology, modern scientific computing) which give JSU (under)graduates skills that are important in their career but not otherwise available.
We have described the lessons of these experiments in two papers [Bernholdt:98a] [Bernholdt:99a] and we will not go into details here. The basic architecture is shown in fig. 4, with the Java Tango Interactive server ensuring consistency of collaborative sessions established between Syracuse and several PCs in a laboratory at Jackson State.
We describe below the basic principles of the shared event web-based approach but note here that high bandwidth is not essential. Audio-video conferencing needs a guaranteed 10 (audio) to 100 (video) kilobits per second bandwidth. However, most important is quality of service, especially for the audio, and good bandwidth (typically around one megabit/sec) was one way of "buying" this. Note the shared event approach shares the specification of shared objects and not the objects themselves. Thus the Tango Server scales to many participants and the intersite bandwidth need not be large. Fetching the curriculum material from the Web server to each student machine could be a serious bottleneck, but this can be drastically reduced by using either a proxy server or a mirror site at the student location. Let us briefly describe the basic concepts underlying our approach to distance education.
Learning is an example of an activity which can be thought of in terms of objects (audio streams when you talk, books, homework, science fair exhibits) worked on alone or together -- either between students or between students and teacher. Electronic support of learning, and hence of education and training, requires digital support for objects used either individually or by multiple people together. Clearly distributed objects, and in particular their object web implementation, are the core technology, with web pages representing the simplest distributed object. The most important form of sharing is immediate from the web model and corresponds to, say, the teacher posting a web page and the students studying it at a later time. The same model with reversed roles comes when the students place their homework on the web for teacher review. This is termed asynchronous collaboration and is an important component of any learning environment. However, we know that more interactive exchanges are also an important part of learning, and in the electronic world this corresponds to distributed objects being shared in a way that their state is kept coherent. This synchronous collaboration function is provided by the Tango Interactive system [Tango:99] in our distance education work. Tango Interactive shares the specification of objects and not their realization. So if, for instance, the web pages are expressed as XML defining their content, then Tango shares this XML form and the pages can be separately rendered on each client using in general different style sheets. This important concept allows one to have collaborating clients with different capabilities; one could be a PC on a high speed network viewing the high resolution version of an image; another client could be on a modem line with modest resolution, while a third could be a palmtop device viewing a thumbnail. Similarly, visually impaired users could collaborate in the same session with the XML rendered not as an image at all but rather in audio form. Shared web pages are surprisingly powerful, for backend databases and computer simulations can both be viewed through web portals and hence shared using this strategy.
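The idea of sharing an object's specification rather than its realization can be sketched in a few lines. This is an illustrative toy, not Tango Interactive's actual API: the names `SessionServer`, `Client`, the three renderer functions, and the `<page>` XML element are all hypothetical. The server relays one small XML specification to every participant, and each client renders it according to its own capabilities.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, List


def render_desktop(spec: str) -> str:
    """High-bandwidth client: shows the full-resolution image."""
    page = ET.fromstring(spec)
    return f"{page.get('title')} [full-resolution {page.get('image')}]"


def render_palmtop(spec: str) -> str:
    """Small device: shows only a thumbnail of the same image."""
    page = ET.fromstring(spec)
    return f"{page.get('title')} [thumbnail of {page.get('image')}]"


def render_audio(spec: str) -> str:
    """Audio client: no image at all, the title is read aloud."""
    page = ET.fromstring(spec)
    return f"(spoken) {page.get('title')}"


@dataclass
class Client:
    name: str
    render: Callable[[str], str]  # local rendering of a shared specification
    view: str = ""

    def on_event(self, spec: str) -> None:
        self.view = self.render(spec)


@dataclass
class SessionServer:
    clients: List[Client] = field(default_factory=list)

    def join(self, client: Client) -> None:
        self.clients.append(client)

    def publish(self, spec: str) -> int:
        """Relay the specification to every client; return total bytes relayed."""
        for client in self.clients:
            client.on_event(spec)
        return len(spec.encode("utf-8")) * len(self.clients)


server = SessionServer()
for participant in (
    Client("desktop", render_desktop),
    Client("palmtop", render_palmtop),
    Client("audio", render_audio),
):
    server.join(participant)

spec = '<page title="Monte Carlo" image="lattice.png"/>'
bytes_relayed = server.publish(spec)
views = {client.name: client.view for client in server.clients}
print(bytes_relayed, views)
```

Because only the short specification crosses the network, the traffic through the server grows with the size of the specification and the number of participants, not with the size of the rendered image. This is the sense in which the intersite bandwidth need not be large.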
It appears that distance education built around shared web pages (with maybe back end databases viewed through the web) is already viable, and it seems likely that the current somewhat rough hardware and software infrastructure will improve so that widespread deployment will be possible in the next few years.
There are many important aspects of the business model for universities that are implied by an increasing use of distance education. For instance, we have already suggested a different model for course preparation, with several faculty collaborating on a given course. These form "centers of excellence" which may not even be part of a traditional university. I, for instance, threaten to move to a hermit's cave in the Adirondacks and offer such Internetics courses through a T1 line. In our experience, we see universities still keeping a mentoring role as well as providing the living and learning infrastructure. Maybe one model is that university departments are still responsible for designing curriculum but each year they choose how to provide the necessary courses, which together form the degree that is characteristic of the university. Maybe they would deliver their own introductory physics course, but introduction to computational science would come from Purdue; Java language instruction from digitalthink.com [Digitalthink:99]; Internetics from Hermit's Cave Virtual University; and Quantum Field Theory from Princeton. This model creates a very competitive environment, which seems to be to the student's advantage, as it will improve the quality of instruction. As mentioned, it may or may not lead to lower cost of instruction. As an illustration of the value of this approach, note that there appear to be difficulties with computer science instruction at many universities and a significant number of the courses have problems. Some are taught well but the course content has not kept up with the field moving with Internet time; the course should either not be taught or be moved from a mainstream course to an optional track taught occasionally. Another set of courses uses modern textbooks and up to date curriculum but unfortunately the assigned Professor has been unable to keep up with change, and the instructor's understanding, the course material, or both are inadequate. This scenario is typical of any field with changing curriculum, and distance education can address both these difficulties and prepare students better for today's world. Looking at tables 1 and 3, we note that curriculum is naturally packaged in units which are different from those in today's universities. NPAC claims to produce a 2 course sequence in computational science or a 4 course sequence in Internetics. It is these units which are natural for a department to assemble into a complete degree. It is easier to build coherent degrees out of self contained units, such as certificates in computational science or Internetics, than out of individual courses plucked from different sources.
My remarks are not particularly original, for there are many universities and commercial companies following this model. In fact these ideas are probably being implemented first in corporate training, which can evolve faster than the university establishment.

Fig. 1: Computational Science Program at Syracuse University as proposed in 1991.

Fig. 2: Information Track of Computational Science at Syracuse University

Fig. 4: Tango Interactive as used for Distance Education between Syracuse and Jackson State Universities

So in 1995 we designed a companion to the computational science program, focused on information based applications and illustrated in fig. 2. Later my collaborator Prof. Xiaoming Li (now at Peking University) introduced the term Internetics to cover the full program illustrated in figures 1 and 2. [Li:99] We already had an approved program in computational science at Syracuse and so we expanded this to have both an information and a simulation track. Naturally this program developed incrementally and this intermittent progress is recorded in table 2. [Fox:99b] Our current core graduate program for the information track is given in table 3, while table 4 specifies courses at pre-college and undergraduate levels. The graduate courses are also suitable for continuing education and were offered in this fashion in 1997 as shown in table 2.

Table 3: Core Information Track Computational Science Courses in Syracuse Program

Xiaoming Li and I sketched out a complete Internetics curriculum starting at pre-college and with undergraduate and graduate (continuing education) programs. The proposed graduate level certificate included 6 courses with coverage similar to those in table 3.

Table 4: Additional Internetics Courses developed by NPAC
Course | Description
Java Academy | Course for Middle and High School Students taught every Saturday for 2 hours over a 12 week period
PHY300 | New course under development aimed at teaching use of Internetics for communicating science (see sec. 4)
CPS406 | Senior Undergraduate version of the course CPS606 in table 3