Fusion - get started
In [3]:
import pandas as pd
from fusion import Fusion
import matplotlib.pyplot as plt
plt.style.use("bmh")
Establish the connection
In [4]:
fusion = Fusion()
Show the available functionality
In [5]:
fusion
Out[5]:
Fusion object
Available methods:
+------------------------------+--------------------------------------------------------------------------------------------------+
| attribute                    | Instantiate an Attribute object with this client for metadata creation. |
| attributes                   | Instantiate an Attributes object with this client for metadata creation. |
| catalog_resources            | List the resources contained within the catalog, for example products and datasets. |
| create_dataset_lineage       | Upload lineage to a dataset. |
| dataset                      | Instantiate a Dataset object with this client for metadata creation. |
| dataset_resources            | List the resources available for a dataset, currently this will always be a datasetseries. |
| datasetmember_resources      | List the available resources for a datasetseries member. |
| delete_all_datasetmembers    | Delete all dataset members within a dataset. |
| delete_datasetmembers        | Delete dataset members. |
| download                     | Downloads the requested distributions of a dataset to disk. |
| from_bytes                   | Uploads data from an object in memory. |
| get_events                   | Run server sent event listener and print out the new events. Keyboard terminate to stop. |
| get_fusion_filesystem        | Creates Fusion Filesystem. |
| list_catalogs                | Lists the catalogs available to the API account. |
| list_dataset_attributes      | Returns the list of attributes that are in the dataset. |
| list_dataset_lineage         | List the upstream and downstream lineage of the dataset. |
| list_datasetmembers          | List the available members in the dataset series. |
| list_datasets                | Get the datasets contained in a catalog. |
| list_distributions           | List the available distributions (downloadable instances of the dataset with a format type). |
| list_product_dataset_mapping | get the product to dataset linking contained in a catalog. A product is a grouping of datasets. |
| list_products                | Get the products contained in a catalog. A product is a grouping of datasets. |
| listen_to_events             | Run server sent event listener in the background. Retrieve results by running get_events. |
| product                      | Instantiate a Product object with this client for metadata creation. |
| to_bytes                     | Returns an instance of dataset (the distribution) as a bytes object. |
| to_df                        | Gets distributions for a specified date or date range and returns the data as a dataframe. |
| to_table                     | Gets distributions for a specified date or date range and returns the data as an arrow table. |
| upload                       | Uploads the requested files/files to Fusion. |
| default_catalog              | Returns the default catalog. |
+------------------------------+--------------------------------------------------------------------------------------------------+
Access function documentation
In [6]:
fusion.to_df?
Signature:
fusion.to_df(
    dataset: 'str',
    dt_str: 'str' = 'latest',
    dataset_format: 'str' = 'parquet',
    catalog: 'str | None' = None,
    n_par: 'int | None' = None,
    show_progress: 'bool' = True,
    columns: 'list[str] | None' = None,
    filters: 'PyArrowFilterT | None' = None,
    force_download: 'bool' = False,
    download_folder: 'str | None' = None,
    dataframe_type: 'str' = 'pandas',
    **kwargs: 'Any',
) -> 'pd.DataFrame'
Docstring:
Gets distributions for a specified date or date range and returns the data as a dataframe.

Args:
    dataset (str): A dataset identifier
    dt_str (str, optional): Either a single date or a range identified by a start or end date,
        or both separated with a ":". Defaults to 'latest' which will return the most recent
        instance of the dataset.
    dataset_format (str, optional): The file format, e.g. CSV or Parquet. Defaults to 'parquet'.
    catalog (str, optional): A catalog identifier. Defaults to 'common'.
    n_par (int, optional): Specify how many distributions to download in parallel.
        Defaults to all cpus available.
    show_progress (bool, optional): Display a progress bar during data download Defaults to True.
    columns (List, optional): A list of columns to return from a parquet file. Defaults to None
    filters (List, optional): List[Tuple] or List[List[Tuple]] or None (default)
        Rows which do not match the filter predicate will be removed from scanned data.
        Partition keys embedded in a nested directory structure will be exploited to avoid
        loading files at all if they contain no matching rows. If use_legacy_dataset is True,
        filters can only reference partition keys and only a hive-style directory structure
        is supported. When setting use_legacy_dataset to False, also within-file level filtering
        and different partitioning schemes are supported.
        More on https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetDataset.html
    force_download (bool, optional): If True then will always download a file even
        if it is already on disk. Defaults to False.
    download_folder (str, optional): The path, absolute or relative, where downloaded files are saved.
        Defaults to download_folder as set in __init__
    dataframe_type (str, optional): Type

Returns:
    class:`pandas.DataFrame`: a dataframe containing the requested data.
        If multiple dataset instances are retrieved then these are concatenated first.
File: ~/fusion/py_src/fusion/fusion.py
Type: method
View Catalogs
In [7]:
fusion.list_catalogs()
Out[7]:
| | identifier | description | @id | isInternal | title |
---|---|---|---|---|---|
0 | common | A catalog of common data | common/ | False | Common |
1 | fusiondemo | A catalog of fusion demo data | fusiondemo/ | False | Fusion Demo |
Explore the datasets
In [8]:
fusion.list_datasets("FX")
Out[8]:
| | identifier | title | containerType | region | category | description | status |
---|---|---|---|---|---|---|---|
0 | FXO_SP | FX Cash Rate | Snapshot-Full | EMEA, North America, Emerging Markets, APAC, G... | FX | This dataset includes FX spot rates for major ... | Subscribed |
3 | FXO_ST | FX Option Structure \| Strangles | Snapshot-Full | EMEA, North America, APAC, Emerging Markets, G... | FX | Implied volatility for 10 and 25 delta FX opti... | Subscribed |
17 | FX_INTRADAY_FWD_G10 | FX INTRADAY FORWARDS - G10 | Time-Series-Snapshot-Full | Global | FX | FX Forwards Intraday Dataset\n\n | Available |
33 | JPM_FX_Forwards_EOD | JPM FX Forwards End of Day | Time-Series-Delta | Americas, Emerging Markets, Global | FX | JPM End of Day marks for Forwards | Available |
60 | FXO_VOL_INTRA_EM | FXO Intraday Implied Volatility Snaps - EM | Snapshot-Full | Emerging Markets, Global | FX | J.P. Morgan’s FXO Intraday Implied Volatility ... | Subscribed |
94 | STANDARD_MARKET_VALUATION_DETAIL_CCY_CONTRACTS | Sample: Market Valuation Detail Currency Contr... | Snapshot-Full | Global | Fund Accounting | This dataset exclusively shows all pending spo... | Available |
98 | FXO_RR | FX Option Structure \| Risk Reversal | Snapshot-Full | EMEA, North America, APAC, Emerging Markets, G... | FX | Implied volatility for 10 and 25 delta FX opti... | Subscribed |
114 | MO_Standard_Closed_Tax_Lot | MO Standard Closed Tax Lot | Snapshot-Full | Global | Middle Office | This dataset provides details at the Tax Lot l... | Available |
115 | FX_ECONOMIC | FX Specialized \| Momentum Strategies (Economics) | Snapshot-Full | EMEA, North America, APAC, Emerging Markets, G... | FX | Momentum signals in a trend following strategy... | Subscribed |
121 | FXO_VOL_EOD_EM | FXO End of Day Implied Volatility - EM | Snapshot-Full | Emerging Markets, Global | FX | J.P. Morgan’s FXO End of Day Implied Volatilit... | Subscribed |
123 | STANDARD_FUTURE_VALUED_FX_CONTRACTS | Sample: Future Valued FX Contracts | Snapshot-Full | Global | Custody | Standard Future Valued FX Contracts dataset di... | Available |
165 | STANDARD_CLIENT_STANDING_INSTRUCTIONS | Sample: Client Standing Instructions | Snapshot-Full | Global | Custody | Standard dataset detailing client standing ins... | Available |
183 | FXO_VOL_INTRA_G10 | FXO Intraday Implied Volatility Snaps - G10 | Snapshot-Full | Global | FX | J.P. Morgan’s FXO Intraday Implied Volatility ... | Available |
239 | FX_JPM_TCI | FX Passive Index | Snapshot-Full | EMEA, North America, APAC, Global | FX | FX passive index level and currency sub-indices. | Subscribed |
240 | FX_EASIDX | Economic Activity Surprise Index (EASI) FX | Snapshot-Full | EMEA, North America, Emerging Markets, APAC, G... | Economic | The Economic Activity Surprise Index is publis... | Subscribed |
258 | FXO_VOL_EOD_G10 | FXO End of Day Implied Volatility - G10 | Snapshot-Full | Global | FX | J.P. Morgan’s FXO End of Day Implied Volatilit... | Subscribed |
259 | MO_Standard_Position_Summary | MO Standard Position Summary | Snapshot-Full | Global | Middle Office | This dataset provides Portfolio level valuatio... | Available |
276 | STANDARD_CLS_FX_TRANSACTIONS | Sample: CLS FX Transactions | Snapshot-Full | Global | Custody | Standard dataset detailing all CLS FX trade ac... | Available |
278 | JPM_FX_Spot_EOD | JPM FX Spot End of Day | Time-Series-Full | Americas, EMEA, APAC | FX | JPM FX Spot rates sourced from trading desks | Available |
287 | JPM_FX_Vols_EOD | JPM FX Vols End Of Day | Time-Series-Full | Americas, Emerging Markets, Global | FX | FX EOD Vols across G10 and EM | Available |
296 | STANDARD_VALUED_HOLDINGS | Sample: Valued Holdings | Snapshot-Full | Global | Middle Office | Provides market value in local currency of the... | Available |
348 | STANDARD_VALUED_HOLDINGS_SUMMARY | Sample: Valued Holdings Summary | Snapshot-Full | Global | Middle Office | Provides portfolio level valuation in a specif... | Available |
366 | FX_MEAN_HFFV | FX Mean Reversion Strategies Hi Freq Fair Value | Snapshot-Full | EMEA, North America, APAC, Global | FX | The FX High Frequency Fair Value dataset from ... | Subscribed |
409 | FX_MEAN_IMM | FX Mean Reversion Strategies IMM | Snapshot-Full | EMEA, North America, APAC, Emerging Markets, G... | FX | The FX Mean Reversion, IMM dataset from J.P. M... | Subscribed |
Display the attributes
In [9]:
fusion.list_dataset_attributes("FXO_SP")
Out[9]:
| | isDatasetKey | identifier | description | title | dataType |
---|---|---|---|---|---|
0 | True | instrument_name | The instrument name | Instrument Name | String |
1 | False | currency_pair | The currency pair | Currency Pair | String |
2 | False | term | The time period of an investment, agreement or... | Term | String |
3 | False | product | The product identifier | Product | String |
4 | False | date | The snapshot date | Date | String |
5 | False | fx_rate | The spot and forward fx rate | FX Rate | Double |
Display the dataset members
In [10]:
fusion.list_datasetmembers("FXO_SP")
Out[10]:
| | identifier | createdDate | fromDate | @id | toDate |
---|---|---|---|---|---|
0 | 20190101 | 2019-01-01 | 2019-01-01 | 20190101/ | 2019-01-01 |
1 | 20190102 | 2019-01-02 | 2019-01-02 | 20190102/ | 2019-01-02 |
2 | 20190103 | 2019-01-03 | 2019-01-03 | 20190103/ | 2019-01-03 |
3 | 20190104 | 2019-01-04 | 2019-01-04 | 20190104/ | 2019-01-04 |
4 | 20190107 | 2019-01-07 | 2019-01-07 | 20190107/ | 2019-01-07 |
... | ... | ... | ... | ... | ... |
1498 | 20241018 | 2024-10-21 | 2024-10-18 | 20241018/ | 2024-10-18 |
1499 | 20241021 | 2024-10-22 | 2024-10-21 | 20241021/ | 2024-10-21 |
1500 | 20241022 | 2024-10-23 | 2024-10-22 | 20241022/ | 2024-10-22 |
1501 | 20241023 | 2024-10-24 | 2024-10-23 | 20241023/ | 2024-10-23 |
1502 | 20241024 | 2024-10-25 | 2024-10-24 | 20241024/ | 2024-10-24 |
1503 rows × 5 columns
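The member identifiers above are `YYYYMMDD` strings, the same form that the `dt_str` argument of `to_df` accepts as a single date or as a range with a start, an end, or both separated by `:`. The sketch below shows one way such a string could be resolved against a list of identifiers. `split_dt_str` and `select_members` are hypothetical helpers written for illustration, not the SDK's implementation, and treating both range endpoints as inclusive is an assumption.

```python
from __future__ import annotations


def split_dt_str(dt_str: str) -> tuple[str | None, str | None]:
    """Split a date string into (start, end); None marks an open-ended side."""
    if ":" not in dt_str:
        # A single date is a range that starts and ends on the same day.
        return dt_str, dt_str
    start, end = dt_str.split(":", 1)
    return start or None, end or None


def select_members(members: list[str], dt_str: str = "latest") -> list[str]:
    """Return the YYYYMMDD identifiers selected by dt_str (hypothetical helper)."""
    if dt_str == "latest":
        # YYYYMMDD strings sort lexicographically in calendar order.
        return [max(members)] if members else []
    start, end = split_dt_str(dt_str)
    return [
        m
        for m in members
        if (start is None or m >= start) and (end is None or m <= end)
    ]


# The last five identifiers from the listing above.
members = ["20241018", "20241021", "20241022", "20241023", "20241024"]

print(select_members(members, "20241021:20241023"))
print(select_members(members, "20241023:"))
print(select_members(members))
```

Because the identifiers are zero-padded, plain string comparison is enough here and no date parsing is needed.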
Display the available distributions for a file
In [11]:
fusion.list_distributions("FXO_SP", "20241024")
Out[11]:
| | identifier | fileExtension | mediaType | @id | title | description |
---|---|---|---|---|---|---|
0 | csv | .csv | text/csv; header=present; charset=utf-8 | csv/ | CSV | Snapshot data will be in a tabular, comma sepa... |
1 | parquet | .parquet | application/parquet; header=present | parquet/ | Parquet | Snapshot data will be in a parquet format. |
Download and load
In [12]:
df = fusion.to_df(
    "FXO_SP",
    "20241001:20241024",
    dataset_format="csv",
    columns=["currency_pair", "date", "fx_rate"],
    filters=[("currency_pair", "=", "GBPUSD")],
)
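The `filters` argument in the call above uses the tuple form described in the pyarrow documentation linked from the `to_df` docstring: a flat list of `(column, operator, value)` tuples is combined with AND, and a list of such lists is an OR of AND groups. The following plain-Python sketch illustrates that logic only. `row_matches` is a hypothetical helper, not part of the Fusion SDK or pyarrow, and the EURUSD row is a made-up placeholder.

```python
import operator
from typing import Any

# Comparison operators handled by this sketch.
_OPS = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda value, options: value in options,
    "not in": lambda value, options: value not in options,
}


def row_matches(row: dict[str, Any], filters: list) -> bool:
    """Evaluate tuple-style filters against one row (hypothetical helper)."""
    if not filters:
        return True
    # A flat list of tuples is a single AND group; a list of lists is OR of ANDs.
    groups = [filters] if isinstance(filters[0], tuple) else filters
    return any(
        all(_OPS[op](row[column], value) for column, op, value in group)
        for group in groups
    )


rows = [
    {"currency_pair": "GBPUSD", "date": 20241001, "fx_rate": 1.32885},
    {"currency_pair": "GBPUSD", "date": 20241002, "fx_rate": 1.32505},
    {"currency_pair": "EURUSD", "date": 20241001, "fx_rate": 1.0},  # placeholder row
]

single = [("currency_pair", "=", "GBPUSD")]
both = [("currency_pair", "=", "GBPUSD"), ("date", ">=", 20241002)]
either = [[("currency_pair", "=", "EURUSD")], [("date", "=", 20241002)]]

for name, flt in [("single", single), ("both", both), ("either", either)]:
    print(name, [r["date"] for r in rows if row_matches(r, flt)])
```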
Analyze
In [13]:
df.head()
Out[13]:
| | currency_pair | date | fx_rate |
---|---|---|---|
0 | GBPUSD | 20241001 | 1.32885 |
1 | GBPUSD | 20241002 | 1.32505 |
2 | GBPUSD | 20241003 | 1.31075 |
3 | GBPUSD | 20241004 | 1.31185 |
4 | GBPUSD | 20241007 | 1.30785 |
In [14]:
df["date"] = pd.to_datetime(df["date"].astype("str"))
df.sort_values("date").set_index("date").plot(grid=True);
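The `astype("str")` step matters if the `date` column was read as integers, which is an assumption here based on values such as 20241001 in the output above. `pd.to_datetime` interprets bare integers as nanosecond offsets from the Unix epoch rather than as calendar dates. A minimal sketch, using a small hand-built frame in place of the downloaded data:

```python
import pandas as pd

# Hand-built stand-in for the downloaded frame; values copied from df.head() above.
sample = pd.DataFrame(
    {
        "date": [20241001, 20241002, 20241003],  # assumed to load as integers
        "fx_rate": [1.32885, 1.32505, 1.31075],
    }
)

# Integers are read as nanoseconds since 1970-01-01, not as YYYYMMDD.
as_epoch = pd.to_datetime(sample["date"])

# Converting to strings first (and optionally naming the format) gives calendar dates.
as_dates = pd.to_datetime(sample["date"].astype("str"), format="%Y%m%d")

print(as_epoch.iloc[0].year)    # 1970
print(as_dates.iloc[0].date())  # 2024-10-01
```

Passing `format="%Y%m%d"` is optional but removes any reliance on format inference.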