  1. python - Why does Dask perform so slower while multiprocessing …

    Sep 6, 2019 · 36 dask delayed 10.288054704666138s. My CPU has 6 physical cores. Question: Why does Dask perform so much slower while multiprocessing performs so much faster? Am I using Dask the wrong …

  2. Reading an SQL query into a Dask DataFrame - Stack Overflow

    May 24, 2022 · I'm trying to create a function that takes an SQL SELECT query as a parameter and uses dask to read its results into a dask DataFrame using the dask.read_sql_query function. I am new to …

  3. How to transform Dask.DataFrame to pd.DataFrame?

    Aug 18, 2016 · How can I transform my resulting dask.DataFrame into pandas.DataFrame (let's say I am done with heavy lifting, and just want to apply sklearn to my aggregate result)?

  4. dask: looping over groupby groups efficiently - Stack Overflow

    Mar 25, 2025 · How can I efficiently perform a groupby operation on a Dask DataFrame without loading everything into memory or computing multiple times? Is there an updated best practice for this in 2025?

  5. How to specify correct dtype for column of lists when creating a dask ...

    Oct 9, 2023 · When creating a dask Dataframe with the from_pandas method, the formerly correct dtype object becomes a string[pyarrow]. import dask.dataframe as dd import pandas as pd df = …

  6. python - Difference between dask.distributed LocalCluster with threads ...

    Sep 2, 2019 · What is the difference between the following LocalCluster configurations for dask.distributed? Client(n_workers=4, processes=False, threads_per_worker=1) versus …

  7. How to Set Dask Dashboard Address with SLURMRunner (Jobqueue) …

    Dec 17, 2024 · I am trying to run a Dask Scheduler and Workers on a remote cluster using SLURMRunner from dask-jobqueue. I want to bind the Dask dashboard to 0.0.0.0 (so it’s accessible …

  8. python - Why does dask take long time to compute regardless of the …

    Mar 24, 2022 · The reason dask dataframe is taking more time to compute (shape or any operation) is because when a compute op is called, dask tries to perform operations from the creation of the …

  9. python - Using Matplotlib with Dask - Stack Overflow

    Jul 15, 2022 · Don't use Matplotlib, use hvPlot! If you wish to plot the data while it's still large, I recommend using hvPlot, as it can natively handle dask dataframes. It also automatically provides …

  10. What is the logic behind "compute()" in Dask dataframes?

    May 23, 2021 · Each partition in a Dask DataFrame is a Pandas DataFrame. compute() combines all the partitions (Pandas DataFrames) into a single Pandas DataFrame. Dask is fast because it can …