Best Computer Security Method Overlooked By Industry
Wed, 13 Mar 2002 15:55:27 -0500
"A team of Penn State and Iowa State researchers has tested and rated
three “smart” classification methods capable of detecting the
patterns of entry and misuse left by the typical computer network
intruder. They found that one, called “rough sets,” currently
overlooked by industry, is the best.
The researchers report that computer security breaches have risen
significantly in the last three years. In February 2000, Yahoo, Amazon,
E-Bay, Datek and E-Trade were shut down due to denial-of-service attacks
on their web servers.
The U.S. General Accounting Office (GAO) reports that about 250,000
break-ins into Federal computer systems were attempted in one year and
64 percent were successful. The number of attacks is doubling every year
and the GAO estimates that only one to four percent of these attacks
will be detected and only about one percent will be reported.
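To put the GAO percentages in absolute terms, the short calculation below works through the arithmetic. It assumes, since the report as quoted here does not say so explicitly, that the detection percentages apply to the 250,000 attempted attacks; the variable names are illustrative only.

```python
# Figures quoted from the GAO report above.
attempted = 250_000
success_percent = 64

# Integer arithmetic avoids floating-point rounding surprises.
successful = attempted * success_percent // 100

# Assumption: "one to four percent" refers to all attempted attacks.
detected_low = attempted * 1 // 100
detected_high = attempted * 4 // 100

print(f"successful break-ins: {successful}")
print(f"detected attacks: {detected_low} to {detected_high}")
```

Under that reading, roughly 160,000 attempts succeeded while only 2,500 to 10,000 attacks would be detected.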
Dr. Chao-Hsien Chu, associate professor of information sciences and
technology and of management science and information systems at Penn
State, began the study when he was on the faculty at Iowa State.
The results were published in the current issue (Vol. 32, No. 4) of the
journal Decision Sciences. His Iowa State co-authors are Dr. Dan Zhu,
assistant professor of management information systems; Dr. G.
Premkumar, associate professor of management information systems; and
Xiaoning Zhang, Chu’s former master’s student.
“No network security system or firewall can ever be completely
foolproof,” Chu says. “So there is always a need for a ‘watchdog’ to
patrol the network and signal when an intrusion occurs. Commercially
available ‘watchdog’ systems depend on traditional statistical
techniques. However, the newer ‘smart’ methods promise to have a
significant impact on accuracy.”
Even the cleverest intruder leaves electronic footprints when breaking
into a secure computer data network, such as one holding bank, medical
or credit records. The new “smart” methods can collect information from
a variety of sources within the network, “learn” the patterns typical of
a perpetrator trying to gain a level of control similar to that of the
people who legitimately operate the network, and make a reasoned
prediction about whether the pattern represents intrusion or not.
The team focused on three “smart” approaches, known as data mining
techniques, namely: neural nets, inductive learning and rough sets. All
three data mining techniques can collect information, “learn” and
make predictions.
Neural nets and inductive learning have previously been used in
intrusion detection, and research by others has found these methods to
be successful and effective. Chu notes that rough sets, a relatively new
approach, has not been applied to intrusion detection.
The researchers say their study is the first to evaluate and compare
multiple data mining methods, including rough sets, in the intrusion
The researchers report that the rough sets method does not require any
preliminary or additional information about the data and can work with
missing values and less expensive or alternative sets of measurements.
The method can also work with imprecise values, replacing imprecise or
uncertain data with a pair of lower and upper approximations.
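The lower and upper approximations mentioned above can be illustrated with a minimal sketch of the standard rough-set construction. This is not the researchers' implementation: the function name `approximations`, the session records, and the two attributes (login failures, off-hours access) are hypothetical toy data chosen for illustration.

```python
from collections import defaultdict

def approximations(objects, target):
    """Return (lower, upper) rough-set approximations of `target`.

    objects: dict mapping object id -> tuple of attribute values
    target:  set of object ids belonging to the concept of interest
    """
    # Objects with identical attribute values are indiscernible,
    # so group them into equivalence classes.
    classes = defaultdict(set)
    for obj_id, attrs in objects.items():
        classes[attrs].add(obj_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target
            lower |= members
        if members & target:    # class overlaps the target at all
            upper |= members
    return lower, upper

# Hypothetical sessions described by (login_failures, off_hours).
sessions = {
    "s1": ("high", "yes"),
    "s2": ("high", "yes"),
    "s3": ("low", "no"),
    "s4": ("high", "no"),
    "s5": ("low", "no"),
}
intrusions = {"s1", "s4"}  # sessions labelled as intrusions (toy labels)

lower, upper = approximations(sessions, intrusions)
print(sorted(lower))  # ['s4']
print(sorted(upper))  # ['s1', 's2', 's4']
```

Here s4 is certainly an intrusion given the available attributes, while s1 and s2 cannot be told apart, so both fall only in the upper approximation: they are possibly intrusions.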
It is also able to discover important facts hidden in the data and
express them in the natural language of decision rules. A powerful
method for characterizing complex multidimensional patterns, rough sets
has been successfully applied in knowledge acquisition, forecasting and
predictive modeling, and decision support.
In their study, the team used data from the privileged program
sendmail, a program in use at virtually every Unix site that has email.
They write, “The data includes both normal and abnormal traces. The
normal trace is a trace of the sendmail daemon and several invocations
of the sendmail program. During the period of collecting these traces,
there are no intrusions or any suspicious activities happening. The
abnormal traces contain several traces including intrusions that exploit
well-known problems in Unix systems.”
The average classification accuracy rate for the three methods was as
follows: rough sets 75.68 percent accurate; neural nets 69.78 percent
accurate; and inductive learning 51.16 percent accurate.
In addition, the team found that training the programs on equal amounts
of normal and abnormal sequences leads to better learning and a more
accurate classification. Whether the data was represented as binaries or
as integers (neural nets cannot use both), did not significantly affect
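One way to obtain a training set with equal amounts of normal and abnormal sequences is to undersample the larger class, as sketched below. The article does not say how the team balanced its data, so the undersampling approach, the function name `balance`, the 0/1 labels, and the toy sequences are all assumptions made for illustration.

```python
import random

def balance(normal, abnormal, seed=0):
    """Return a shuffled list of (sequence, label) pairs with equal
    numbers of normal (label 0) and abnormal (label 1) sequences."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = min(len(normal), len(abnormal))
    # Draw n sequences from each class without replacement.
    pairs = [(seq, 0) for seq in rng.sample(normal, n)]
    pairs += [(seq, 1) for seq in rng.sample(abnormal, n)]
    rng.shuffle(pairs)
    return pairs

# Hypothetical traces encoded as tuples of integer event codes.
normal_seqs = [(i, i + 1, i + 2) for i in range(10)]
abnormal_seqs = [(90, 91, 92), (93, 94, 95), (96, 97, 98)]

training = balance(normal_seqs, abnormal_seqs)
print(len(training))                            # 6
print(sum(label for _, label in training))      # 3
```

Undersampling discards some normal data; oversampling the smaller class is a common alternative when abnormal traces are scarce.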
They conclude, “The tremendous growth in the Internet and electronic
commerce has created serious challenges to network security. Advances in
data mining and knowledge discovery provide new approaches to network