
*Scientific Inquiry, vol. 9, No. 2, December, 2008, pp. 123 – 130 IIGSS Academic Publisher *


**LOGISTIC MODEL BASED ON TWO-STAGE FCM CLUSTER AND ITS APPLICATIONS IN CURATIVE EFFECT ANALYSIS**

SHAN ZENG<sup>a</sup>, XIAOJUN TONG<sup>a,b,*</sup>, QIUMING HUANG<sup>a</sup>

*<sup>a</sup>Department of Mathematics and Physics, Wuhan Polytechnic University, Wuhan 430023;*
*<sup>b</sup>Department of Control Science and Engineering, Huazhong University of Science and Technology,*
(Received July 05, 2007; In final form May 11, 2008)

The fuzzy c-means (FCM) algorithm is sensitive to its initial values, and its iteration easily converges to a local minimum. To address these shortcomings, this article proposes a two-stage fuzzy c-means clustering algorithm. The first stage estimates the number of classes and selects the initial cluster centers by similarity-nearness matching; the second stage clusters the data with the fuzzy c-means algorithm. Because the cluster center has a statistical character, the gray logistic model built on it overcomes the internal error of the individual samples while forecasting the cluster center. We analyze and compare a set of medical data released by the American AIDS clinical trials organization ACTG in order to propose a sound method for evaluating and forecasting curative effect.

*Keywords*: Fuzzy c-means clustering; Cluster center; Gray logistic forecast model; Curative effect evaluation and forecast

**1. INTRODUCTION **

In a forecast model, the primary data are generally observations or statistics, and such data are frequently affected by randomness as well as by the fuzzy concepts people use when interpreting observations and judging observed phenomena. The measured values therefore combine precision with fuzziness. As the complexity of a system increases, the precision with which it can be described decreases; beyond a certain threshold, complexity and precision become mutually exclusive. This is the principle of incompatibility for large-scale systems. Cluster analysis measures how close objects are according to some criterion and groups close objects together, providing a basis for analysis and decision-making. In forecasting medical curative effect or regional economic development, we first group the many patients or regions according to their curative effect or economic development, and then forecast each group, which greatly improves forecast precision.

Among the many classification methods, fuzzy clustering based on an objective function is the most widely accepted: clustering is formulated as a constrained nonlinear programming problem, and the fuzzy partition and clustering of the data are obtained from its optimal solution. This method is simple and widely applied. It can also be treated as an optimization problem using the nonlinear programming theory of classical mathematics, and it is easy to implement on a computer. The fuzzy c-means clustering algorithm (FCM), an objective-function-based clustering algorithm established by (Dunn, 1974) and (Bezdek, 1981), is the most mature and the most widely used. This algorithm, however, requires the number of classes to be determined and the cluster centers to be estimated. There are many ways to determine the number of classes, but finding a simple and fast algorithm is still an open question. Estimating the cluster centers, which directly determines whether the result is a local or a global optimum, is likewise an active topic. From the computation of the fuzzy cluster centers we know that each cluster center is a weighted average; it therefore has a statistical character and overcomes the random error among the samples.

* Corresponding author, e-mail: [email protected]

*Scientific Inquiry: A Journal of International Institute for General Systems Studies, Inc.*
*http://www.iigss.net/Scientific-Inquiry/mission.html*

(Tong and Chen, 2002) proposed a new gray logistic forecast model. The model has a mathematical foundation and allows for perturbation error in the data. The computed examples in that paper show several characteristics: (1) the gray logistic model has the best fitting error and also the best forecast performance; (2) its computation is simple and fast; (3) it uses no weighting, yet its fitting and forecast errors are better than those of the weighted models, whose choice of weights also involves a human factor.
Evaluating and forecasting curative effect is an important part of clinical trials. Patients differ in initial symptoms as well as in age, physical condition and other diseases, so the effect of the same treatment plan differs from patient to patient. This article therefore applies fuzzy c-means clustering to more than 300 AIDS patients according to their CD4 concentration. Patients with similar treatment responses are grouped into one class, which avoids one-sided conclusions and makes it easier to discover how a given medicine takes effect. We then propose a scheme for forecasting and evaluating AIDS curative effect.

**2. IMPROVING THE FUZZY C-MEANS ALGORITHM**

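The fuzzy c-means clustering algorithm (FCM) is widely used in classification. Its objective function $J_m$ is defined as follows:

$$J_m(U, P) = \sum_{k=1}^{n} \sum_{i=1}^{c} (\mu_{ik})^m (d_{ik})^2, \qquad (1)$$

where $P = (p_1, p_2, \ldots, p_c)$, with each $p_i \in R^s$, is the cluster center vector; $(d_{ik})^2 = \|x_k - p_i\|_A^2 = (x_k - p_i)^T A (x_k - p_i)$ is a distance between the $k$th vector $x_k$ and the $i$th cluster center vector $p_i$; $A$ is an $s \times s$ symmetric positive definite matrix; $m \in [1, \infty)$ is a smoothing weight; and $\mu_{ik}$ is the membership of the $k$th data point in the $i$th class. The clustering criterion is to take the minimum of $J_m(U, P)$.

Because the columns of the matrix $U = [\mu_{ik}]$ are independent, we have

$$\min J_m(U, P) = \sum_{k=1}^{n} \min \left\{ \sum_{i=1}^{c} (\mu_{ik})^m (d_{ik})^2 \right\}. \qquad (2)$$

This minimization is subject to the constraint $\sum_{i=1}^{c} \mu_{ik} = 1$. For any $k$, define the sets $I_k$ and $\bar{I}_k$ as follows:

$$I_k = \{\, i \mid 1 \le i \le c,\ d_{ik} = 0 \,\}, \qquad \bar{I}_k = \{1, 2, \ldots, c\} - I_k.$$

The solution obtained with the Lagrange multiplier method is given below:

$$\mu_{ik} = \frac{1}{\sum_{j=1}^{c} \left( d_{ik} / d_{jk} \right)^{2/(m-1)}} \quad \text{when } I_k = \varnothing; \qquad \mu_{ik} = 0 \ \ \forall i \in \bar{I}_k \ \text{ and } \sum_{i \in I_k} \mu_{ik} = 1 \quad \text{when } I_k \ne \varnothing, \qquad (3)$$

$$p_i = \frac{\sum_{k=1}^{n} (\mu_{ik})^m x_k}{\sum_{k=1}^{n} (\mu_{ik})^m}. \qquad (4)$$

The objective of the clustering is to minimize the objective function with respect to the partition matrix and the cluster centers. An iterative algorithm can solve this kind of optimization problem.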
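As an illustration of the membership update (3) and the center update (4), here is a minimal Python sketch. It assumes the Euclidean distance (that is, $A$ is the identity matrix), and the function names `update_memberships` and `update_centers` are hypothetical, chosen only for this example. When a sample coincides with one or more centers, the sketch shares the membership equally among them, which is one admissible choice under the constraint and not necessarily the authors' choice.

```python
import numpy as np

def update_memberships(X, P, m=2.0):
    """Membership update (3). X: (n, s) samples, P: (c, s) centers.
    Returns U with shape (c, n); each column sums to 1.
    Assumption: Euclidean distance, i.e. A = identity."""
    # Squared distances d_ik^2 between center i and sample k, shape (c, n)
    d2 = ((P[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    U = np.zeros_like(d2)
    for k in range(X.shape[0]):
        zero = d2[:, k] == 0.0
        if zero.any():
            # I_k is non-empty: membership goes only to coinciding centers
            U[zero, k] = 1.0 / zero.sum()
        else:
            # d_ik^(-2/(m-1)), normalized over the c centers
            w = d2[:, k] ** (-1.0 / (m - 1.0))
            U[:, k] = w / w.sum()
    return U

def update_centers(X, U, m=2.0):
    """Center update (4): mean of the samples weighted by mu_ik^m."""
    W = U ** m
    return (W @ X) / W.sum(axis=1, keepdims=True)
```

Calling `update_memberships` and then `update_centers` once corresponds to one sweep of the iteration; the exponent `m` must be greater than 1 for the update to be defined.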
From formula (4) it is easy to see that a cluster center is a weighted average of the sample features, so it helps eliminate the error in the feature data of the samples in each class. The fuzzy c-means algorithm, however, is sensitive to its initial values, and its iteration easily converges to a local minimum. This article therefore proposes a two-stage fuzzy c-means clustering algorithm: first estimate the number of classes and select the initial cluster centers by similarity-nearness matching, then cluster with the fuzzy c-means algorithm. The concrete computation steps are as follows:

Step one: Select the initial cluster centers and determine the number of clusters $c$ by similarity-nearness matching. Because the notion of a cluster involves nearness as well as similarity, we use the matching established in (Tong and Zhang, 2005), which is based on a minimum problem defined for two fuzzy sets $A$ and $B$. When $p = 2$, the minimum value attained by this problem is taken as the matching of the two fuzzy sets $A$ and $B$. This matching reflects both the nearness and the similarity of the two fuzzy sets. From it we can infer the nearness and similarity correlation degree of two ordered sets:

∑ max{*a*, *b*} min{*a*, *b*}

The data must undergo a preprocessing ("melting") step before the matching is computed. This step determines the number of clusters $c$ ($2 \le c \le n$) and the initial partition matrix $R^{(0)}$.


Step two: Given the number of clusters $c$ ($2 \le c \le n$) and the number of samples $n$, set the iteration stopping threshold $\varepsilon$, the initial partition matrix $R^{(0)}$ and the iteration counter $b = 0$, and use (3) to compute or update the partition matrix $[\mu_{ik}]$.

Step three: Use (4) to update the cluster center matrix $p^{(b+1)}$.

Step four: If $\|p^{(b)} - p^{(b+1)}\| < \varepsilon$, stop the computation and output the partition matrix $U$ and the cluster centers $P$; otherwise set $b = b + 1$ and return to step two to repeat the computation.
As the algorithm above shows, the entire computation is a process of repeatedly revising the cluster centers and the partition matrix. The example below shows that the improved algorithm converges much better and avoids falling into a local minimum. The classification yields the classes and a representative of each class, namely its cluster center, and we build the forecast model on these cluster centers. On the one hand this keeps the computational load small; on the other hand, because a cluster center is a weighted average, it partly cancels the error among the samples.
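The second stage (steps two to four) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `fcm` and the `max_iter` safeguard are hypothetical, the distance is Euclidean, the initial centers `P0` are assumed to come from stage one (which is not implemented here), and zero distances are handled by a small floor value, which simplifies the case distinction in (3).

```python
import numpy as np

def fcm(X, P0, m=2.0, eps=1e-3, max_iter=100):
    """Iterate the membership and center updates until the centers
    move by less than eps. Returns (U, P, number_of_iterations)."""
    X = np.asarray(X, dtype=float)
    P = np.asarray(P0, dtype=float)
    for b in range(max_iter):
        # Step two: memberships from the current centers
        d2 = ((P[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # floor instead of the exact d_ik = 0 case
        W = d2 ** (-1.0 / (m - 1.0))
        U = W / W.sum(axis=0, keepdims=True)
        # Step three: centers as means weighted by mu_ik^m
        Um = U ** m
        P_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Step four: stop when the centers no longer move
        if np.linalg.norm(P_new - P) < eps:
            return U, P_new, b + 1
        P = P_new
    return U, P, max_iter

# Made-up one-dimensional data with two visible groups
X = np.array([[0.0], [1.0], [10.0], [11.0]])
U, P, n_iter = fcm(X, P0=[[0.0], [11.0]])
```

Because the loop only refines the centers it is given, the quality of `P0` decides which minimum is reached, which is the motivation for the first stage.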

**3. LOGISTIC GRAY FORECAST MODEL **

(Tong and Chen, 2002) gave the following gray logistic model based on the concept of perturbation. Assume the primary data are $y^{(0)}$ with $y^{(0)}(i) > 0$, $i = 1, 2, \ldots, n$. Apply a generating operation to $y^{(0)}$ using the reciprocal transformation, namely

$$x^{(0)}(i) = \frac{1}{y^{(0)}(i)}, \qquad i = 1, 2, \ldots, n,$$

and process the sequence $x^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ as follows:
(

*k*) − 2

*x *(

*k*) =

*p *( (1)

*x *(

*k*) +

*x *(

*k *+1) +

*p k *+

*p*
(

*k*) −

*x *(

*k*) =

*p x *(

*k *+ )
1 +

*p k *+

*p f *(

*k*) +

*p *, where
The computational method for the gray logistic model is as follows:

1. Assume the primary data are $y^{(0)}$ with $y^{(0)}(i) > 0$, $i = 1, 2, \ldots, n$, and apply a generating operation to $y^{(0)}$ using the reciprocal transformation, namely $x^{(0)}(i) = 1 / y^{(0)}(i)$.

2. (1) When the absolute error of the original data sequence $x^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ is of concern, the least-squares solution of the following equations is generally used:

*x *(

*n *−1) +

*x *(

*n*)

where *p*2, *p*3, *p*4, satisfy and *p*2 = −*e* ; solving yields the parameter value $a$. This is called Method I.

(2) When the relative error of the original data sequence $x^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ is of concern, the least-squares solution of the following equations is generally used:

*x *(2)

*x *(2)

*x *(

*n*)

*n *−1

*f *(

*n *−1) 1

*p *

*x *(

*n *−1)

*x *(

*n *−1)
where *p*2, *p*3 and *p*2 = ; solving yields the parameter value $a$. This is called Method II.
For small data sets we add the following procedure.

(3) When the original data sequence $x^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ is short, the least-squares problems in (1) and (2) should take the influence of the first value $x^{(0)}(1)$ into account. With few data, although $x^{(0)}(1)$ equals $x^{(1)}(1)$, the information carried by $x^{(0)}(1)$ cannot be left out. For example, for (2) the least-squares equations are

*x *(1)

*x *(1)

*x *(

*n*)

*n *−1

*f *(

*n *−1) 1

*p *

*x *(

*n *−1)

*x *(

*n *−1)
For the three cases above, once the parameter value $a$ has been obtained, use $\hat{x}^{(0)}(k) = b e^{-ak} + c$ to obtain the parameters $b$ and $c$ by linear fitting. The model for the original data then follows from the reciprocal transformation, $\hat{y}^{(0)}(k) = 1 / \hat{x}^{(0)}(k)$.

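The final fitting step can be illustrated with a short Python sketch. It assumes that the parameter $a$ has already been obtained (Methods I and II for estimating $a$ are not implemented here), that the index runs over $k = 1, \ldots, n$, and that the model has the form $x^{(0)}(k) = b e^{-ak} + c$ with $y = 1/x$. The function names `fit_b_c` and `forecast` are hypothetical.

```python
import numpy as np

def fit_b_c(y, a):
    """Given a, fit b and c in x(k) = b*exp(-a*k) + c by linear least
    squares, where x = 1/y is the reciprocal-transformed series.
    Assumption: k = 1, ..., n."""
    y = np.asarray(y, dtype=float)
    x = 1.0 / y                      # reciprocal transformation
    k = np.arange(1, len(y) + 1)
    # The model is linear in (b, c) once a is fixed
    design = np.column_stack([np.exp(-a * k), np.ones(len(y))])
    (b, c), *_ = np.linalg.lstsq(design, x, rcond=None)
    return b, c

def forecast(a, b, c, k):
    """Map back to the original scale: y(k) = 1 / (b*exp(-a*k) + c)."""
    k = np.asarray(k, dtype=float)
    return 1.0 / (b * np.exp(-a * k) + c)
```

Fixing $a$ first is what makes the remaining problem linear, so `b` and `c` come from a single least-squares solve with no iteration.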

**4. FORECAST MODEL AND MEDICAL CURATIVE EFFECT ANALYSIS BASED ON CLUSTER CENTER GRAY LOGISTIC**

Because patients differ in initial symptoms and physical condition, the curative effect of the same therapy also differs among them. Before forecasting curative effect, we should therefore first classify the patients according to the curative effect data. We generally divide them into three classes: good, average and poor curative effect. Each of these classes is then divided into two: good initial condition and poor initial condition. Finally we forecast each cluster center separately. Forecasting on the cluster centers after classification reflects the curative effect of a therapy more effectively.
We now take a set of medical data (China mathematics, 2006) released by the American AIDS clinical trials organization ACTG: more than 300 patients each took zidovudine, lamivudine and indinavir simultaneously, and each patient's CD4 concentration per milliliter of blood was measured every few weeks. For the given sample, we first use the two-stage clustering method of Section 2 to compute the number of classes and the cluster centers. Then, for each class, we use the gray logistic forecast model of Section 3 to forecast the cluster center. The concrete procedure is as follows:

Use the standard FCM to classify the more than 300 patients according to their CD4 concentration, taking q = 1.1 and e = 0.001 in the FCM computation. After 12 iterations, the classification coefficient is $F_c(R) = 0.8746$ and the average fuzzy entropy is $H_c(R) = 0.2673$; these values indicate a good clustering. The resulting cluster centers are given in Table 1:

For the 90 patients whose CD4 concentration is greater than 200, use the standard FCM to classify them. The classification coefficient is $F_c(R) = 0.8718$ and the average fuzzy entropy is $H_c(R) = 0.2989$. The resulting cluster centers are given in Table 2:

For the 41 patients whose CD4 concentration increases continuously, use the standard FCM to classify them. The classification coefficient is $F_c(R) = 0.8520$ and the average fuzzy entropy is $H_c(R) = 0.2524$. The resulting cluster centers are given in Table 3:
The cluster centers shown in Tables 1 to 3 were fitted with the gray logistic forecast model. Each cluster center leads to a model with the following parameters:

| Cluster centers | Class | $a$ | $b$ | $c$ |
|---|---|---|---|---|
| Table 1 | first | 0.6395 | 0.0335 | 0.0066 |
| Table 1 | second | 0.7626 | 0.0081 | 0.0033 |
| Table 2 | first | 0.5341 | 0.0064 | 0.0024 |
| Table 2 | second | 0.5526 | 0.0146 | 0.0031 |
| Table 3 | first | 0.5867 | 0.0652 | 0.0015 |
| Table 3 | second | 0.3343 | 0.0069 | 0.0014 |


Compute $\hat{x}^{(0)}(i)$ according to the forecast model and then $\hat{y}^{(0)}(i)$; then compute the absolute error sequence and the relative error sequence of the initial sequence $y^{(0)}(i)$.
The computation results are given below (* denotes a forecast value):
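The evaluation of a fitted model and the two error sequences can be sketched in Python as follows. The sketch assumes the model form $\hat{y}(k) = 1 / (b e^{-ak} + c)$, with the parameters reported for the first class of Table 1; the function names are hypothetical, the index values $k = 1, \ldots, 5$ are chosen only for illustration, and no observed data from the paper are reproduced here.

```python
import numpy as np

def logistic_model(a, b, c, k):
    """Assumed model form: y_hat(k) = 1 / (b*exp(-a*k) + c)."""
    k = np.asarray(k, dtype=float)
    return 1.0 / (b * np.exp(-a * k) + c)

def error_sequences(y, y_hat):
    """Absolute error y - y_hat and relative error (y - y_hat) / y."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    abs_err = y - y_hat
    rel_err = abs_err / y
    return abs_err, rel_err

# Parameters reported in the text for Table 1, first class
a, b, c = 0.6395, 0.0335, 0.0066
k = np.arange(1, 6)            # illustrative index values only
y_hat = logistic_model(a, b, c, k)
saturation = 1.0 / c           # value the model approaches as k grows
```

Under this reading, the fitted curve rises monotonically toward `1 / c`, so the parameter `c` alone fixes the long-run level of the forecast.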
The most persuasive way to classify patients does not rely on predetermined standard class characteristics or on classes chosen by intuition. Moreover, a patient may show features of more than one class, so a soft partition is quite reasonable. Fuzzy clustering meets both requirements. In classifying the more than 300 AIDS patients above, this article constructs a two-stage FCM clustering algorithm that computes the number of classes and selects the initial cluster centers by similarity-nearness matching, which overcomes the FCM algorithm's sensitivity to initial values and its tendency to fall into a local minimum. Each resulting cluster center is a weighted average of the sample features of its class; it therefore has a statistical character, overcomes the random error among samples and provides good preprocessing for building the forecast model.
The new gray logistic forecast model has a mathematical foundation, and the data it processes may contain perturbation error. For S-shaped data the model fits and forecasts well. Its computation is simple and fast and does not rely on parameters chosen by hand. For small data sets, this article has completed the computational method of the model. Medical curative effect data are S-shaped, so using the new gray logistic forecast model is reasonable.
Because AIDS patients differ in initial symptoms and physical condition, the curative effect of a given therapy differs among them. For FCM clustering, the closer $F_c$ is to 1 and the closer $H_c$ is to 0, the better the clustering. Using the two-stage FCM clustering method of this article to classify the more than 300 patients, the classification coefficient $F_c$ is 0.9851 and the average fuzzy entropy $H_c$ is 0.0299, showing that the clustering is extremely good. Moreover, with the model proposed here, the random error of the samples is overcome because the cluster center has a statistical character: the gray logistic model overcomes the internal error of the individual samples while forecasting the cluster center.
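The two validity indices can be computed from a partition matrix as sketched below. The paper does not state their formulas, so the sketch assumes the usual definitions associated with Bezdek: the partition coefficient $F_c = \frac{1}{n}\sum_k \sum_i \mu_{ik}^2$ and the partition entropy $H_c = -\frac{1}{n}\sum_k \sum_i \mu_{ik} \ln \mu_{ik}$, with the natural logarithm. The function names are hypothetical.

```python
import numpy as np

def partition_coefficient(U):
    """F_c: mean over samples of the sum of squared memberships.
    U has shape (c, n) with columns summing to 1."""
    U = np.asarray(U, dtype=float)
    return float((U ** 2).sum() / U.shape[1])

def partition_entropy(U):
    """H_c: mean over samples of -sum(mu * ln(mu)), with 0*ln(0) = 0."""
    U = np.asarray(U, dtype=float)
    safe = np.where(U > 0, U, 1.0)   # ln(1) = 0 removes zero entries
    return float(-(U * np.log(safe)).sum() / U.shape[1])
```

With these definitions a crisp partition gives $F_c = 1$ and $H_c = 0$, while a completely uniform partition gives $F_c = 1/c$ and $H_c = \ln c$, which matches the reading that values near 1 and 0 indicate a good clustering.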
The gray logistic forecast model based on the improved fuzzy c-means clustering classifies the samples reasonably and forecasts each class of samples reasonably. Compared with traditional forecasting it is new and precise, and it can be applied to forecasting population, economic development and medical curative effect.

**Acknowledgement **

This work was supported by the National Natural Science Foundation of China under Grants 79970025, 60403002 and 30370356; the plan of Science and Technological Innovation Team of the Outstanding Young and Middle-aged Scholars of Hubei Provincial Department of Education; the Hubei Provincial Natural Science Foundation under Grants 2004ABA031 and 2005ABA233; the National Postdoctoral Science Foundation of China (Grant 2004036016); the Foundation of Hubei Provincial Department of Education (Grant 2003X130); and the Scientific Research of Wuhan Polytechnic University (Grant 06Q15).

**References **

Bezdek, J., (1981) *Pattern Recognition with Fuzzy Objective Function Algorithms*. Plenum, New York.

China mathematics modeling website, http://www.shumo.com/main/, (2006) national modeling contest, Problem B.

Dunn, J. C., (1974) “A fuzzy relative of the ISODATA process and its use in detecting compact well-separated cluster”. *J. Cybernet.*, vol. 3, pp. 32 – 57.

Tong, X. J. and Chen, M. Y., (2002) “Based on grading form gray logistic model”. Control and Policy-Making,

Tong, X. J. and Zhang, S. M., (2005) “Similarity and nearness of fuzzy sets”. Proceedings of 2005 International Conference on Machine Learning and Cybernetics, vol. 8, pp. 2668 – 2670.

Source: http://www.iigss.net/scientific_inquiry/2008-12/4-Tong.pdf
