The estimation models used data from the Behavioral Risk Factor Surveillance System (BRFSS [1]) survey. In total, the analysis examined survey data from 1,497 counties, including the 644 counties in the CDC diabetes belt [2] spanning 16 states in the US.

The statistical models used in this analysis aim to reduce the estimation error of diabetes prevalence relative to direct estimation methods, so as to support efficient policy formulation and budget allocation. To this end, we generated diabetes prevalence estimates for the 1,188 counties with a complete set of information and for another 295 counties not covered by the BRFSS survey; among the 1,188 counties, 824 have sample sizes small enough to trigger data suppression (Healthy People 2020 data suppression rules for BRFSS [3]).

Unlike the direct method usually applied for such estimation, the approach in this analysis yielded statistically significant estimates.

Considerable progress has been made on ROC curves and related fields over the past decades. In this dissertation, we propose a plug-in empirical likelihood (EL) procedure, combining placement values and inverse probability weighting techniques, to construct stable and precise confidence intervals for ROC curves, the difference of two ROC curves, AUCs, and the difference of two AUCs under right censoring. We prove that the limiting distribution of the EL ratio is a weighted $\chi^2$ distribution. Furthermore, we introduce a jackknife empirical likelihood (JEL) procedure to explore the difference of two correlated VUSs with complete data. We prove that the limiting distribution of the proposed JEL ratio is a $\chi^2$ distribution, i.e., Wilks' theorem holds. Extensive simulation studies demonstrate that the new methods outperform existing methods in terms of coverage probability of confidence intervals in most cases. Finally, the proposed methods are applied to data sets on Primary Biliary Cirrhosis (PBC), Alzheimer's disease, and other conditions.
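As a minimal illustration of the jackknife step underlying a JEL procedure, the sketch below computes the empirical (Mann-Whitney) AUC and its leave-one-out pseudo-values for the simplest case of one AUC with complete, uncensored data; the function names and data are ours, not the dissertation's.

```python
import numpy as np

def empirical_auc(x, y):
    """Mann-Whitney estimate of AUC = P(Y > X), counting ties as half."""
    x = np.asarray(x, float); y = np.asarray(y, float)
    gt = (y[:, None] > x[None, :]).mean()
    eq = (y[:, None] == x[None, :]).mean()
    return gt + 0.5 * eq

def jackknife_pseudo_values(x, y):
    """Leave-one-out pseudo-values over the pooled sample, as used in JEL."""
    z = np.concatenate([x, y])
    labels = np.array([0] * len(x) + [1] * len(y))
    n = len(z)
    theta = empirical_auc(x, y)
    pv = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        zi, li = z[keep], labels[keep]
        theta_i = empirical_auc(zi[li == 0], zi[li == 1])
        pv[i] = n * theta - (n - 1) * theta_i   # jackknife pseudo-value
    return pv

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 40)   # scores of non-diseased subjects
y = rng.normal(1.0, 1.0, 40)   # scores of diseased subjects
pv = jackknife_pseudo_values(x, y)
print(round(empirical_auc(x, y), 3), round(pv.mean(), 3))
```

The JEL procedure then applies standard empirical likelihood to these approximately independent pseudo-values rather than to the original, dependent pairwise comparisons.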

We first examine the neuron cell growth process, which has implications for neural tissue regeneration, using a computational model with a uniform branching probability and a maximum overall length constraint. One crucial outcome is that we can relate the parameter fits from our model to real data from our experimental collaborators, in order to examine the usefulness of our model under different biological conditions. Our methods can now directly compare branching probabilities across experimental conditions and provide confidence intervals for these population-level measures. In addition, we have obtained analytical results showing that the underlying probability distribution for this process increases as a geometric progression at nearby distances and decreases approximately as a geometric series in far-away regions, which can be used to estimate the spatial location of the maximum of the probability distribution. This result is important, since we would expect the maximum number of dendrites in this region; the estimate is related to the probability of success in finding a neural target at that distance during a blind search.
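A toy Monte Carlo (our own sketch with arbitrary parameters, not the authors' calibrated model) illustrates the rise-then-fall shape: tips multiply geometrically under a uniform branching probability until the total-length cap binds.

```python
import numpy as np

rng = np.random.default_rng(1)
p, L_max, trials = 0.3, 200, 2000   # branching prob and length cap (assumed values)
max_d = 60                          # distance bins from the soma
counts = np.zeros(max_d)

for _ in range(trials):
    tips, total = 1, 0
    for d in range(max_d):
        # each advancing tip consumes one unit of the total-length budget
        grow = min(tips, L_max - total)
        if grow <= 0:
            break
        total += grow
        counts[d] += grow
        # each advancing tip splits into two with probability p
        tips = grow + rng.binomial(grow, p)

counts /= counts.sum()              # empirical distribution over distance
peak = int(counts.argmax())
print(peak)
```

The counts grow roughly like $(1+p)^d$ at small distances and collapse once the length budget is exhausted, placing the mode near where the cap starts to bind.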

We then examined tumor growth processes, which show a similar evolution in the sense that an initial rapid growth eventually becomes limited by resource constraints. For tumor cell evolution, we found that an exponential growth model best describes the experimental data, based on model accuracy and robustness. Furthermore, we incorporated this growth rate model into logistic regression models that predict the growth rate of each patient from biomarkers; this formulation can be very useful for clinical trials. Overall, this study aimed to assess the molecular and clinicopathological determinants of breast cancer (BC) growth rate in vivo.
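Fitting an exponential growth model reduces to log-linear least squares; the sketch below uses synthetic, purely illustrative tumor-volume measurements, not the study's data.

```python
import numpy as np

# Synthetic volume measurements around V(t) = 100 * exp(0.05 t) with noise
t = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])   # days
noise = np.random.default_rng(2).normal(0.0, 0.05, t.size)
v = 100.0 * np.exp(0.05 * t) * np.exp(noise)

# Exponential model V(t) = V0 * exp(k t)  <=>  log V = log V0 + k t
k, log_v0 = np.polyfit(t, np.log(v), 1)
print(round(k, 3), round(np.exp(log_v0), 1))
```

The fitted rate `k` per patient is the quantity that would then enter a downstream regression on biomarkers.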

For a specified patient population, cost estimates are generally determined from the beginning of treatment until death or the end of the study period. A number of statistical methods have been proposed to estimate medical costs. Since medical cost data are skewed to the right, normal-approximation-based confidence intervals can have much lower coverage probability than the desired nominal level when the cost data are moderately or severely skewed. Additionally, the variance estimators of the cost estimates are analytically complicated.

To address some of the above issues, in the first part of the dissertation we propose two empirical likelihood-based confidence intervals for mean medical costs: one is an empirical likelihood interval (ELI) based on influence functions, and the other is a jackknife empirical likelihood (JEL) based interval. We prove that, under very general conditions, $-2\log(\text{empirical likelihood ratio})$ has an asymptotic standard $\chi^2$ distribution with one degree of freedom for the mean medical cost. We also show that the log-jackknife empirical likelihood ratio statistic follows a standard $\chi^2$ distribution with one degree of freedom for the mean medical cost.
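For intuition, the empirical likelihood statistic for a mean can be computed directly via its standard Lagrange-multiplier form; this sketch is for plain uncensored data (the censoring and influence-function machinery is what the dissertation adds) and all names are ours.

```python
import numpy as np

def neg2_log_el_ratio(x, mu):
    """-2 log(empirical likelihood ratio) for the mean, evaluated at mu.

    Solves sum(d_i / (1 + lam*d_i)) = 0, d_i = x_i - mu, for the Lagrange
    multiplier lam by bisection; requires min(x) < mu < max(x).
    """
    d = np.asarray(x, float) - mu
    if d.min() >= 0 or d.max() <= 0:
        return np.inf                      # mu outside the convex hull of the data
    eps = 1e-10                            # 1 + lam*d_i > 0 bounds lam to an interval
    lo = -1.0 / d.max() + eps
    hi = -1.0 / d.min() - eps
    g = lambda lam: np.sum(d / (1.0 + lam * d))
    for _ in range(200):                   # g is strictly decreasing: bisect
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * d))

rng = np.random.default_rng(3)
x = rng.lognormal(0.0, 1.0, 200)           # right-skewed "cost" sample
stat = neg2_log_el_ratio(x, np.exp(0.5))   # evaluated at the true LogNormal mean
print(round(stat, 2))                      # compare with the chi2(1) 95% cutoff 3.84
```

A confidence interval is the set of candidate means whose statistic falls below the $\chi^2_1$ cutoff; unlike normal-approximation intervals, its shape adapts to the skewness of the data.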

In the second part of the dissertation, we propose an influence function-based empirical likelihood method to construct a confidence region for the vector of regression parameters in mean cost regression models with censored data. The proposed confidence region can be used to obtain a confidence interval for the expected total cost of a patient with given covariates. The new method has sound asymptotic properties (Wilks' theorem).

In the third part of the dissertation, we propose an influence function-based empirical likelihood method to construct confidence intervals for quantile medical costs with censored data. We prove that, under very general conditions, $-2\log(\text{empirical likelihood ratio})$ has an asymptotic standard $\chi^2$ distribution with one degree of freedom for the quantile medical cost. Simulation studies are conducted to compare the coverage probabilities and interval lengths of the proposed confidence intervals with those of existing confidence intervals. The proposed methods are observed to have better finite-sample performance than existing methods. The new methods are also illustrated with a real example.

Consequently, my Ph.D. research focuses on a specific class of homogeneous integrate-and-fire neural networks, for which analytical solutions of the network dynamics can be derived. One crucial analytical finding is that the traveling wave acceleration depends quadratically on the instantaneous speed of the activity propagation, which means that two speed solutions exist for wave propagation: one fast and stable, the other slow and unstable.
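Schematically (in our notation, not the dissertation's), a quadratic dependence of acceleration on speed with two positive roots produces exactly this stability split:

```latex
\dot{v} \;=\; -c\,\bigl(v - v_{\mathrm{slow}}\bigr)\bigl(v - v_{\mathrm{fast}}\bigr),
\qquad c > 0,\quad 0 < v_{\mathrm{slow}} < v_{\mathrm{fast}} .
```

Linearizing about each equilibrium gives $\left.\frac{\partial \dot{v}}{\partial v}\right|_{v_{\mathrm{fast}}} = -c\,(v_{\mathrm{fast}} - v_{\mathrm{slow}}) < 0$, so the fast solution attracts nearby speeds, while at $v_{\mathrm{slow}}$ the derivative is $+c\,(v_{\mathrm{fast}} - v_{\mathrm{slow}}) > 0$, so the slow solution repels them.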

Furthermore, using this property, we analytically compute spatiotemporal spiking dynamics to gain insight into the stability mechanisms of traveling wave propagation. The analytical solutions are in perfect agreement with the numerical solutions. This analytical method can also be applied to determine the effects induced by a non-conductive gap of brain tissue, and it extends to more general synaptic connectivity functions by converting the evolution equations for the network dynamics into a low-dimensional system of ordinary differential equations.

Building upon these results, we investigate how periodic inhomogeneities affect the dynamics of activity propagation. In particular, two types of periodic inhomogeneities are studied: alternating regions of additional fixed excitation and inhibition, and cosine-form inhomogeneity. Of special interest are the conditions leading to propagation failure. With similar analytical procedures, explicit expressions for the critical speeds of activity propagation are obtained under the influence of additional inhibition and excitation. However, an explicit formula for the speed modulations is difficult to determine in the case of cosine-form inhomogeneity. Instead of exact solutions of the system of equations, a series of speed approximations is constructed, with higher-order approximations yielding higher accuracy.

The accuracy of facial recognition algorithms on images taken in controlled conditions has improved significantly over the last two decades. As the focus turns to more unconstrained or relaxed conditions and toward videos, there is a need to better understand what factors influence performance. If these factors were better understood, it would be easier to predict how well an algorithm will perform when new conditions are introduced.

Previous studies have examined the effect of various factors on the verification rate (VR), but less attention has been paid to the false accept rate (FAR). In this dissertation, we study the effects various factors have on the FAR, as well as the correlation between marginal FAR and VR. Using these relationships, we propose two models to predict the marginal VR and demonstrate that they predict better than the previous global VR.

The latter half focuses on applications of the empirical likelihood method in economics and finance. Two models draw our attention. The first is the predictive regression model with independent and identically distributed errors. Some uniform tests have been proposed in the literature that do not distinguish whether the predicting variable is stationary or nearly integrated. Here, we extend the empirical likelihood methods of Zhu, Cai and Peng (2014) for independent errors to the case of an AR error process. The proposed new tests do not need to know whether the predicting variable is stationary or nearly integrated, or whether it has a finite or infinite variance. The second model we consider is a GARCH(1,1) sequence or an AR(1) model with ARCH(1) errors. It is known that the observations have a heavy tail and that the tail index is determined by an estimating equation. Therefore, one can estimate the tail index by solving the estimating equation with the unknown parameters replaced by the quasi-maximum likelihood estimate (QMLE), and a profile empirical likelihood method can be employed to construct a confidence interval for the tail index. However, this requires that the errors of the model have at least a finite fourth moment, to ensure asymptotic normality at the $n^{1/2}$ rate of convergence and Wilks' theorem. We show that the finite fourth moment condition can be relaxed by employing a least absolute deviations estimate (LADE) instead of the QMLE for the unknown parameters, noting that the estimating equation determining the tail index is invariant under a scale transformation of the underlying model. Furthermore, the proposed tail index estimators have a normal limit at the $n^{1/2}$ rate of convergence under a minimal moment condition, which allows an infinite fourth moment, and Wilks' theorem holds for the proposed profile empirical likelihood methods. Hence a confidence interval for the tail index can be obtained without estimating any additional quantities such as the asymptotic variance.
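For the GARCH(1,1) case with Gaussian innovations, the estimating equation referred to is of Kesten type: the tail index $\kappa$ of the observations solves $E[(\alpha Z^2 + \beta)^{\kappa/2}] = 1$. A Monte Carlo sketch of solving it, with illustrative parameter values standing in for QMLE/LADE estimates:

```python
import numpy as np

# Kesten-type estimating equation for GARCH(1,1) with Gaussian innovations:
# the tail index kappa of X_t solves E[(alpha*Z^2 + beta)^(kappa/2)] = 1.
alpha, beta = 0.10, 0.85              # illustrative parameter values
rng = np.random.default_rng(4)
z2 = rng.standard_normal(200_000) ** 2   # Monte Carlo draws of Z^2

def h(kappa):
    return np.mean((alpha * z2 + beta) ** (kappa / 2.0)) - 1.0

# Under stationarity h(0+) < 0 while h grows without bound, so bisect for the root.
lo, hi = 0.1, 50.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if h(mid) < 0:
        lo = mid
    else:
        hi = mid
kappa = 0.5 * (lo + hi)
print(round(kappa, 2))
```

Because the equation is invariant to rescaling the series, the intercept parameter $\omega$ never enters, which is the invariance the LADE argument above exploits.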

The refined inertia of a square real matrix $B$, denoted $\mathrm{ri}(B)$, is the ordered $4$-tuple $(n_+(B), \ n_-(B), \ n_z(B), \ 2n_p(B))$, where $n_+(B)$ (resp., $n_-(B)$) is the number of eigenvalues of $B$ with positive (resp., negative) real part, $n_z(B)$ is the number of zero eigenvalues of $B$, and $2n_p(B)$ is the number of nonzero pure imaginary eigenvalues of $B$. The minimum rank (resp., rational minimum rank) of a sign pattern matrix $\cal A$ is the minimum of the ranks of the real (resp., rational) matrices whose entries have signs equal to the corresponding entries of $\cal A$.
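The definition translates directly into a numerical computation; the sketch below (our illustration, with a floating-point tolerance in place of exact arithmetic) classifies the eigenvalues of a small example matrix.

```python
import numpy as np

def refined_inertia(B, tol=1e-9):
    """Compute ri(B) = (n_+, n_-, n_z, 2*n_p) from the eigenvalues of B."""
    ev = np.linalg.eigvals(np.asarray(B, float))
    n_plus = int(np.sum(ev.real > tol))                          # positive real part
    n_minus = int(np.sum(ev.real < -tol))                        # negative real part
    small_re = np.abs(ev.real) <= tol
    n_zero = int(np.sum(small_re & (np.abs(ev.imag) <= tol)))    # zero eigenvalues
    two_n_p = int(np.sum(small_re & (np.abs(ev.imag) > tol)))    # nonzero pure imaginary
    return (n_plus, n_minus, n_zero, two_n_p)

# Example with eigenvalues 1, -2, 0, +i, -i
B = np.zeros((5, 5))
B[0, 0], B[1, 1] = 1.0, -2.0          # one positive, one negative eigenvalue
B[3, 4], B[4, 3] = 1.0, -1.0          # 2x2 rotation block: eigenvalues +/- i
print(refined_inertia(B))             # -> (1, 1, 1, 2)
```

The four entries always sum to the order of the matrix, since every eigenvalue falls into exactly one class.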

First, we identify all minimal critical sets of inertias and refined inertias for full sign patterns of order 3. Then we characterize the star sign patterns of order $n\ge 5$ that require the set of refined inertias $\mathbb{H}_n=\{(0, n, 0, 0), (0, n-2, 0, 2), (2, n-2, 0, 0)\}$, which is an important set for the onset of Hopf bifurcation in dynamical systems. Finally, we establish a direct connection between condensed $m \times n$ sign patterns and zero-nonzero patterns with minimum rank $r$ and configurations of $m$ points and $n$ hyperplanes in ${\mathbb R}^{r-1}$. Some results on the rational realizability of the minimum ranks of sign patterns and zero-nonzero patterns are obtained.

Second, as a generalization of (hyper)graph matchings, we determine asymptotically the minimum vertex degree threshold for perfect K_{a,b,c}-tilings in large 3-uniform hypergraphs, where K_{a,b,c} is the complete 3-partite 3-uniform hypergraph with parts of sizes a, b, and c. This partially answers a question of Mycroft, who proved an analogous result with respect to codegree for r-uniform hypergraphs for all r ≥ 3. Our proof uses the Regularity Lemma, the absorbing method, fractional tilings, and a recent result on shadows for 3-graphs.

The second part explores some connections of dense alternating sign matrices with total unimodularity, combined matrices, and generalized complementary basic matrices.

In the third part of the dissertation, an explicit formula for the ranks of dense alternating sign matrices is obtained. The minimum rank and the maximum rank of the sign pattern of a dense alternating sign matrix are determined. Some related results and examples are also provided.

Second, we consider Hamilton cycles in hypergraphs. In particular, we determine the minimum codegree thresholds for Hamilton l-cycles in large k-uniform hypergraphs for l < k/2. We also determine the minimum vertex degree threshold for loose Hamilton cycles in large 3-uniform hypergraphs. These results generalize the well-known theorem of Dirac for graphs.

Third, we determine the minimum codegree threshold for near-perfect matchings in large k-uniform hypergraphs, thereby confirming a conjecture of Rödl, Ruciński and Szemerédi. We also show that the decision problem of whether a k-uniform hypergraph satisfying a certain minimum codegree condition contains a perfect matching can be solved in polynomial time, which completely solves a problem of Karpiński, Ruciński and Szymańska.

Finally, we determine the minimum vertex degree threshold for perfect tilings of C_4^3 in large 3-uniform hypergraphs, where C_4^3 is the unique 3-uniform hypergraph on four vertices with two edges.
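The uniqueness of C_4^3 is easy to verify by brute force: any two distinct triples drawn from a 4-element vertex set intersect in exactly two vertices, so every 2-edge 3-graph on four vertices is isomorphic to it.

```python
from itertools import combinations

# All triples from a 4-element vertex set, and the pairwise edge intersections
vertices = range(4)
triples = list(combinations(vertices, 3))
overlaps = {len(set(a) & set(b)) for a, b in combinations(triples, 2)}
print(overlaps)   # -> {2}
```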
