Generalized Linear Models - Jean-Francois Dupuy - E-Book

Description

Since they were first formulated in 1972, generalized linear models have enjoyed a veritable boom, with numerous applications in insurance, economics and biostatistics. Today, they are still the subject of a great deal of research.

This book provides an overview of the theory of generalized linear models. Particular attention is paid to the problems of censoring, missing data and excess zeros. Didactic and accessible, Generalized Linear Models is illustrated with exercises and numerous R code examples.

With all the necessary prerequisites introduced in a step-by-step fashion, this book is aimed at students (at master's or engineering school level), as well as teachers and practitioners of mathematics and statistical modeling.

Page count: 281

Year of publication: 2025

To my parents, To my daughter Laura, To Marielle and Gaële

Series Editor Nikolaos Limnios

Generalized Linear Models

Problems with Censored, Missing, and Zero-inflated Data

Jean-François Dupuy

First published 2025 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK

www.iste.co.uk

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.wiley.com

© ISTE Ltd 2025

The rights of Jean-François Dupuy to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s), contributor(s) or editor(s) and do not necessarily reflect the views of ISTE Group.

Library of Congress Control Number: 2025932421

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-702-6

Preface

Since their formulation by Nelder and Wedderburn in the Journal of the Royal Statistical Society: Series A in 1972, generalized linear models (GLMs) have become one of the cornerstones of statistical modeling. They have generated, and continue to generate, an abundant literature, driven by theoretical and methodological questions and by their links to applications. This book is devoted to them. It does not, of course, aim to give an exhaustive account of this literature: several of the themes covered in the chapters deserve a book of their own. This is the case for missing data, and for the treatment of a censored response. The aim of this book (and of its author) is therefore more modest: to describe some recently studied problems involving GLMs (missing data, censored data and excess zeros) and to report, again without claiming to be exhaustive, the solutions that have been proposed for them.

The subjects covered in this book therefore do not span the immense variety of contributions to the literature on GLMs. The reader will not find, for example, a chapter dedicated to model validation or to variable selection in high dimensions. The choice of subjects involves some degree of subjectivity and largely reflects the author’s research interests. Another imperative guided the writing of this book and the choice of its content: the methods described here can, for the most part, be implemented using dedicated functions that are readily available in packages of R, the free and open-source software for statistics and data analysis. Certain methods require a little programming work, and R code examples are provided throughout the book.

Note also that most of the problems described here (such as missing or censored data) do not arise only in GLMs. We have therefore tried to describe the proposed solutions in sufficiently general terms for the reader to understand the main principles and to apply, or adapt, them in other contexts. Finally, this book has been written so that it can be approached by readers with different levels of background. It is thus possible to follow the content without dwelling on the theoretical proofs, which are, for the most part, provided in appendices to each chapter. Although the chapters are written so that they can be read (almost) independently of one another, it is nevertheless recommended to read them in the order given in the table of contents, which follows an increasing progression in the difficulty of the methods described. A data set (available in R) serves as a common thread throughout the book; it is described in section 2.5, which should be read before moving on to the other sections that use it.

My warmest thanks go to Nikolaos Limnios, for his constant encouragement since my thesis, for inviting me to write this work and for the additions he has suggested.

I would also like to sincerely thank Ben Brown, who did a tremendous job translating this book from French to English.

Jean-François Dupuy

March 2025

Notation and Acronyms

Mean and variance of X
cov(X, Y): Covariance of X and Y
Empirical mean of a sample Y_1, …, Y_n
Empirical measure
Empirical process
Convergence in distribution
Convergence in probability
Almost sure convergence
o, O: Stochastic-order symbols
Sets of the real numbers, positive reals, strictly positive reals
Binomial, Bernoulli, negative binomial distributions
Poisson and generalized Poisson distributions
Normal, Student’s t-, chi-squared, Fisher distributions
Gamma, exponential, inverse-Gaussian distributions
u_α, t_n(α), c_α(q), f_{q,p}(α): Quantiles of order α of the standard normal, Student’s t-, chi-squared and Fisher distributions
Φ: Cumulative distribution function of the standard normal distribution
1{·}: Indicator function
Transpose of the matrix (or vector) A

AIC: Akaike’s information criterion
AIPW: Augmented inverse probability weighted
BIC: Bayesian information criterion
EM: Expectation-maximization (algorithm)
GLM: Generalized linear models
i.i.d.: Independent and identically distributed (random variables)
IPW: Inverse probability weighted
LSE: Least-squares estimator
MAR: Missing at random
MCAR: Missing completely at random
MLE: Maximum likelihood estimator
MNAR: Missing not at random
MZINB: Marginal zero-inflated negative binomial
MZIP: Marginal zero-inflated Poisson
RSS: Residual sum of squares
s.e.: Standard error
ZIB: Zero-inflated binomial
ZIGP: Zero-inflated generalized Poisson
ZINB: Zero-inflated negative binomial
ZIP: Zero-inflated Poisson