.. _california_housing_dataset:

California Housing dataset
--------------------------

**Data Set Characteristics:**

:Number of Instances: 20640

:Number of Attributes: 8 numeric, predictive attributes and the target

:Attribute Information:
    - MedInc        median income in block group
    - HouseAge      median house age in block group
    - AveRooms      average number of rooms per household
    - AveBedrms     average number of bedrooms per household
    - Population    block group population
    - AveOccup      average number of household members
    - Latitude      block group latitude
    - Longitude     block group longitude

:Missing Attribute Values: None

This dataset was obtained from the StatLib repository.
http://lib.stat.cmu.edu/datasets/

The target variable is the median house value for California districts,
expressed in units of $100,000.

This dataset was derived from the 1990 U.S. census, using one row per census
block group. A block group is the smallest geographical unit for which the U.S.
Census Bureau publishes sample data (a block group typically has a population
of 600 to 3,000 people).

It can be downloaded and loaded using the
:func:`sklearn.datasets.fetch_california_housing` function.
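
As a minimal sketch of the loader mentioned above, assuming scikit-learn is
installed and the machine can reach the download server on first use, the
snippet below fetches the data and compares it with the characteristics listed
in this description. The variable names ``housing``, ``X`` and ``y`` are
illustrative choices for this example only.

```python
from sklearn.datasets import fetch_california_housing

# The first call downloads the data and caches it locally (by default under
# ~/scikit_learn_data); later calls read from that cache.
housing = fetch_california_housing()

# `housing` is a dict-like Bunch; `X` and `y` are names chosen for this sketch.
X, y = housing.data, housing.target

print(X.shape)                # (20640, 8): one row per block group
print(y.shape)                # (20640,): median house value per block group
print(housing.feature_names)  # the eight attribute names listed above
```

Passing ``as_frame=True`` returns the same data as pandas objects, which can be
convenient for inspecting columns by name. This requires pandas and a
scikit-learn release that supports the parameter.
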

.. topic:: References

    - Pace, R. Kelley and Ronald Barry, Sparse Spatial Autoregressions,
      Statistics and Probability Letters, 33 (1997) 291-297